ABSTRACT

A major difficulty in predictions of school 
enrollments is the failure of the forecaster to express adequately
his degree of certainty in his estimates. To alleviate this problem, 
a method was developed by which a forecaster could prepare 
probability distributions of enrollment predictions. A basic method 
of enrollment prediction was chosen and modified to accommodate 
probabilistic input and output. The method required separate 
estimates for such variables as migration, retention, and transfer; 
and it was modified to require three estimates (high, low, and most 
likely) for each variable. A Monte Carlo computer simulation program 
was written to combine these various estimates into probability 
distributions of enrollment prediction. (Appendix A, pages 150-161, 
may reproduce poorly.) (Author/RA)

PROBABILISTIC SCHOOL ENROLLMENT PREDICTIONS USING
MONTE CARLO COMPUTER SIMULATION
Abstract 

A major difficulty in predictions of school enrollments 
is the failure of the forecaster to express adequately his 
degree of certainty in his estimates. To alleviate this 
problem a method was developed by which a forecaster could 
prepare probability distributions of enrollment predictions. 

A basic method of enrollment prediction was chosen 
and modified to accommodate probabilistic input and output; 
the method required separate estimates for variables such 
as migration, retention, and transfers. The method was 
modified to require three estimates for each variable: 
a high, a low, and a most likely estimate, with the high 
and the low estimates representing the 98 percent confidence 
interval. A Monte Carlo computer simulation program was 
written to combine these various estimates into probability 
distributions of enrollment prediction. Significance tests 
were performed to investigate predictive validity, reliability, 
and concurrent validity. Results indicated adequate reliability 
and validity, contingent upon additional tests, in all but
one of the tests, a test of concurrent validity. The
results of this test suggest the need for re-examining 
the assumptions about the distributions of the probabilistic 
input.

The computer programs are ready for use although 
the user is encouraged to conduct additional tests of 
validity and reliability and to suggest improvements 
in the model. The model is unique among enrollment 
prediction methods in that it requires the user to examine 
the various parts of the system, to estimate probabilities 
for each of these parts, and to use the probabilities 
to determine probabilistic information about the operation 
of the system as a whole.

Chapter I
Introduction 

Systems for predicting school enrollments constitute 
important decision making tools for school management. The 
decisions are important, not only because millions of dollars 
are involved, but because the decisions can greatly affect the 
quality of education school children receive. An under- 
estimate of enrollment may result in crowded classrooms or 
double sessions. Lawrence Derthick (1957) testified that 
double sessions cause children to lose up to two months of 
schooling a year with a corresponding drop in achievement. An 
overestimate may result in a loss of money; obvious inefficiency 
may make the community less willing to support future building 
plans.

The primary purpose of most school enrollment predic- 
tions is to determine the extent, urgency, and immediacy of 
plant capacity needs. In planning new school buildings, 
decisions have to be made with respect to space requirements: 
short-range decisions on classroom space requirements and 
long-range decisions on special facilities such as cafeterias,
auditoriums, and playgrounds. Predictions of probable extent
and timing of peak enrollment or changes in enrollment dis- 
tribution by grade indicate the need for flexibility in the 
school plant. Cost estimates must be made and the exact time 
and place for the erection of buildings must be chosen. 

Related questions are the amount of money required for the 
school buildings, district boundaries, grade level organi- 
zation, and probable future functions of the school plant 
(Larson & Strevell, 1952:65-66).

In addition to space requirements, short-range enrollment 
predictions are useful in making plans for obtaining specialists 
such as counselors or nurses, providing services for exceptional 
students such as the handicapped, insuring an adequate teacher- 
pupil ratio, and purchasing equipment such as teaching machines 
and language laboratory terminals. 

Besides providing information for planning a school
building, the predictions can be useful in facilities planning 
for the school system as a whole. They provide a basis for 
predicting the amount of bonded indebtedness that will have to 
be incurred. Needed school sites can be anticipated; school 
sites and boundaries can be chosen for optimum distribution 
of the population and, if the predictions include the necessary
information, distribution of racial and ethnic groups. Pre-
dictions can be used to indicate when use of substandard 
buildings may be discontinued. 

There is agreement among the forecasters and users of 
enrollment predictions that dependable predictions have a 
substantial influence on the future direction and quality 
of educational programs. Many also realize that predicting
school enrollments is a hazardous task; dependable
results are difficult to guarantee. 

Two of the problems making the task difficult are the 
unpredictability of the phenomena and the inaccuracy of the 
prediction methods used. It is a major thesis of this study 
that another part of the difficulty is the lack of under- 
standing of the prediction results by the user. Often the 
user of the prediction figures is not the originator of 
these figures. Although school boards and administrators 
are most frequently the users of the predictions, the pre- 
dictions are often made by citizens' committees (e.g.,
Population and Housing Committee, 1966; Citizens Advisory 
Committee on School Needs, 1960), by the research personnel 
of the school system, or by consultants (e.g., Marshall, 1968; 
Arthur D. Little, Inc., 1966). The user is often unaware
of the assumptions underlying the predictions. More often
he is unaware of the degree of confidence the forecaster 
has in his predictions. The forecaster may be aware that 
the actual enrollment is unlikely to be exactly as predicted, 
but it is not uncommon for the user to place too much con- 
fidence in single figure predictions. The method of 
producing enrollment predictions developed in the present 
study was, to a large extent, designed to overcome the lack 
of communication between the forecaster and the user. 

Enrollment predictions may be made and presented as 
a single figure, as a high and a low figure, or as three 
or more figures based on different assumptions about the 
future enrollment. In the latter two cases, the figures 
may or may not be accompanied by an indication of the 
probability that the actual enrollment figure will fall 
on or between certain figures. A consideration in choosing
one of these alternatives for the present study was the 
potential for communication to the user of the extent of 
forecaster certainty. A single figure may give the user 
unwarranted confidence in the dependability of the forecast 
(Stanbery, 1952:10). One solution to the communication 
problem was used by Marshall (1968). He explained in words
the uncertainties and the most probable departures from the 
predictions. He stated that "any future change in zoning, 
municipal services, residential development, or parochial 
schooling may send enrollment off in a direction higher or
lower than that projected here [p. A-1]." He also stated
that his projections for elementary school enrollment for
1975 and 1980 are "probably conservative [p. A-9]." However,
when the consultant is hired to produce prediction figures, 
such disclaimers do not have the same impact as do the pre- 
diction figures themselves. Furthermore, if the user is 
going to employ this added information in his planning, 
he must translate the words into approximate numbers or 
ranges of numbers. Assuming that it is the forecaster who 
has the information and expertise for making these numerical 
estimates, it would be desirable for him to present his 
prediction figures in such a way as to communicate his 
extent of certainty in his predictions or his estimates 
of probabilities of various enrollments or ranges of 
enrollment.

Stanbery (1952:10-12) discussed two figure presentations 
(a high and a low figure, representing a probable range) 
and multiple figure presentations as methods of communicating
information to the user. He recommended use of two pre-
dictions, one high and one low, rather than multiple pre-
dictions: he reasoned that a presentation of multiple
predictions is unwieldy and that it is difficult for the
user to know which of the predictions is most likely to 
be realized. Multiple predictions are usually statistical 
computations of enrollments that would result under various 
assumptions. However, a probability distribution of pre- 
dictions would be a way in which the forecaster could 
communicate to the user his judgment of probabilities 
of various outcomes in a manner which is not "unwieldy." 
Stanbery (1952) wrote: 

The factors and conditions affecting population 
change... are so numerous and their effects are 
so varied that it would be impractical to assign 
proper weights to each of them and to compute 
mathematically the probabilities of realizing any 
one figure between the maximum and minimum pro-
jections [p. 45].

In the present study a method was developed for the purpose 
of computing probabilities for enrollment in a way which would 
not be "impractical." The method developed is essentially 
a modification of an existing method of enrollment prediction 
to accommodate probabilistic input and to produce probabilistic
output. In order to modify the basic prediction method, 
computer simulation was used. The simulation produces
probability distributions of enrollments expressed in the
format given in Table 1.1 (p. 8).

The model which was used as the basis of the simulation 
is referred to as the "multivariable" method (Johnson, 1965)
and has been used in various prediction studies (e.g., Center
for Field Studies, 1964; Center for Field Studies, 1956).

The model requires that separate predictions be made for 
major factors affecting school enrollment: births, migrations, 
retentions, transfers to and from nonpublic schools, school 
dropouts, and deaths. Enrollment predictions for a certain year 
and grade are computed by adjusting the enrollment figures for 
the previous year and grade by adding or subtracting, as 
appropriate, the predicted values of each of the factors during the
previous year to obtain an estimate for the year and grade under
consideration. This model was modified for use in the simulation
by requiring the forecaster to estimate high, most likely, and 
low figures for each of the variables, with the high and low 
estimates representing the limits of the 98 percent confidence 
interval. For example, the high estimate of grade 8 migration 
between October 1, 1970 and October 1, 1971, may be 15; the most 
likely estimate, 10; and the low estimate, 5. These three 
numbers are assumed to describe a probability distribution.

TABLE 1.1

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 1
IN 1974 WILL BE LESS THAN THE SPECIFIED
PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05                 2023
    .10                 2100
    .20                 2171
    .30                 2244
    .40                 2316
    .50                 2384
    .60                 2429
    .70                 2513
    .80                 2549
    .90                 2653
    .95                 2735

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 1
IN 1974 WILL BE GREATER THAN THE SPECIFIED
PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05                 2735
    .10                 2653
    .20                 2549
    .30                 2513
    .40                 2429
    .50                 2384
    .60                 2316
    .70                 2244
    .80                 2171
    .90                 2100
    .95                 2023

The computer simulation is used as a means of combining the
distributions of the various factors into probability dis- 
tributions of predicted enrollments. 
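
The combining step can be sketched in a few lines of Python. This is a minimal illustration and not the program written for the study: the triangular distribution, the function names, the base enrollment of 400, and the dropout figures are assumptions made for the example; only the grade 8 migration estimates (5, 10, 15) come from the text. Treating the high and low estimates as hard limits is a simplification, since it ignores the 2 percent of probability that a 98 percent confidence interval leaves outside them.

```python
import random

def simulate(base_enrollment, factors, trials=10000, seed=1):
    """Return sorted simulated enrollments.

    Each factor is (sign, low, most_likely, high); the sign says whether
    the factor is added to or subtracted from the previous enrollment.
    A triangular distribution is ASSUMED for every factor.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        total = base_enrollment
        for sign, low, likely, high in factors:
            total += sign * rng.triangular(low, high, likely)
        results.append(total)
    results.sort()
    return results

def less_than_table(sorted_results, probabilities):
    """Enrollment below which roughly each given share of the trials falls."""
    n = len(sorted_results)
    return [(p, sorted_results[min(n - 1, int(p * n))]) for p in probabilities]

# Hypothetical inputs: 400 pupils last year, the migration estimates from
# the text, and an invented dropout estimate that is subtracted.
factors = [(+1, 5, 10, 15), (-1, 0, 2, 4)]
results = simulate(400, factors)
table = less_than_table(results, [0.05, 0.50, 0.95])
```

Each row of `table` pairs a probability with an enrollment figure, in the same layout as Table 1.1. Passing a different `seed` gives a second run against which the first can be compared.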

The multivariable prediction method was chosen because 
it allows the forecaster freedom to deviate from mere pro- 
jection of past data and allows him to make probability 
statements about individual variables such as migration. 

The literature on school enrollment prediction has very few good 
validity studies, but there are reasonable arguments for 
preferring the multivariable method to other recognized 
prediction methods (Whitla, 1952). 

The data used in the study consisted of enrollment 
predictions made in 1964 for the City of Brockton,
Massachusetts. This use of previously made predictions
allowed for a test of predictive validity; the accuracy 
of the prediction using the multivariable method was com- 
pared to that using the percentage of survival method, 
probably the most popular prediction method. A second 
significance test was used to test the reliability of the 
simulation output when the starting value of the random 
number generator is varied; the hypothesis was that there 
is no difference between output of the simulation using 
two different starting values for the random number generator.
Concurrent validity was investigated by comparing the non-
simulation multivariable predictions to the distributions
produced by the simulation using two different types of 
input data. 

In summary, the study developed a method for producing 
probability distributions of school enrollment predictions; 
a basic method for single figure predictions was chosen, 
and Monte Carlo computer simulation programs were written 
to produce multiple predictions in the form of distributions. 
Significance tests were performed to investigate predictive 
validity of the prediction model and reliability and concurrent 
validity of the simulation output.

Chapter II

Review of the Literature 

We do not have methods which can predict precisely the 
enrollment in a certain grade and year. Our methods are not 
that refined; our prescience as to the future events involved 
is not that accurate. For a prediction study to be complete 
it should include an indication of the forecaster's certainty 
of the predictions. Whether or not the user is also the fore- 
caster, the user of the enrollment predictions should take 
account of the assumptions and uncertainties in making his 
plans. It is the contention here that estimation and commu- 
nication of this information is more precise, forceful, and 
useful if presented in numbers or graphs rather than in words. 

Stanbery (1952:10-12) discussed the relative advantages 
and disadvantages of single, double, and multiple figure pre- 
dictions for these purposes. Among the disadvantages he 
cited for single figure predictions are the unwarranted 
confidence in the accuracy of the prediction it gives the user, 
its failure to give the user any indication of the extent to 
which it might be in error, and the fact that it may be based 
on only one set of assumptions. He stated that double figure
predictions, one high and one low figure, meet these
objections and allow the user to compare the assumptions 
on which the high and low predictions are based and to make 
his own judgment about probabilities of the low figure, the 
high figure, or some intermediate figure being realized. 
However, Stanbery also realized that it might indeed be a 
disadvantage to make it necessary for the user to exercise 
his judgment in this way. Advantages cited for the multiple 
figure predictions were much the same as those cited for double 
figure predictions. Multiple predictions are usually presented 
as statistical computations of the populations that would 
result under various assumptions about such variables as 
birth rate, death rate, and migration. He cited as a dis- 
advantage the fact that little guidance is usually given to 
the user about which of the assumptions are more likely to 
be correct, and many users get the impression that one of the 
intermediate figures is more likely to be realized than one 
of the higher or lower figures, an assumption which is not 
necessarily correct. Also, Stanbery described multiple pre-
dictions as impractical and unwieldy. Consideration of the
foregoing advantages and disadvantages of the various methods
led Stanbery to recommend the use of double figure predictions.

A type of prediction that Stanbery did not consider is
a probability distribution of predictions. Through the 
addition of probabilities to the predictions the forecaster 
indicates to the user which sets of assumptions he feels are 
most likely to be correct. Of course, the forecaster can 
also report the basic assumptions on which the probabilities 
are based, giving the user an opportunity to review and pos- 
sibly modify the judgments of the forecaster. The probability 
distribution of predictions, however, takes the burden of 
choosing among various assumptions from the user and gives it 
to the forecaster, who presumably has access to more informa- 
tion. The present study is an investigation of a method for 
producing multiple figure predictions which are not "impracti-
cal" and "unwieldy" by producing probability distributions of
enrollment predictions using computer simulation. 

It is the present investigator's contention that there is 
a need for a satisfactory method of preparing predictions in 
the form of probability distributions although the application 
of probabilities to predictions has been reported in demo- 
graphic literature. Stanbery (1952:12) recommended that 
high and low figures be chosen to produce a range which can 
be expected with the probability of .50 or greater to contain
the actual population. Griffin and Schmitt (1966) proposed
the use of Monte Carlo computer simulation to produce 
probabilistic output. Peters (1969) is developing a method 
of calculating confidence intervals for school enrollment 
projections, with the intervals being the 90 or 95 percent 
confidence intervals. This is perhaps the most extensive 
attempt to produce probabilistic output. However, Peters' 
method is based on the percentage of survival method, a method of 
forecasting which the present investigator considers unsatisfactory 
in that it fails to allow for changes in trends. 

In the present study the multivariable prediction method
was modified to accommodate probabilistic input and output. 

Since the purpose of the study was not to develop new demo- 
graphic forecasting techniques, but to develop a way to account 
for probabilities in predictions, a previously existing pre- 
diction method was selected and modified. 

A review of the literature showed no single taxonomy of 
enrollment prediction methods to be adequate as a basis for
discussing alternative prediction methods. The lists found 
in the literature are not comprehensive. The problem is 
complicated by differences in terminology among the lists; 
in different lists the same name is used for different methods,
and different names are used for the same method. Methods
listed separately by one writer are combined into one
method by another. There is also confusion between methods
for population prediction and methods for school enrollment 
prediction. An attempt is made here to describe some of the 
commonly named methods of enrollment prediction. The list 
is an incorporation and expansion of the classifications given 
by Peters (1969), Griffith (1964), MacConnell (1957), Strevell
(1952), and the American Association of School Administrators 
(1947). One dimension for describing the methods is the 
amount of opportunity the forecaster has to depart from past 
trends in his forecasts. Methods which rely solely on data
from past trends are referred to as projection methods; methods 
allowing deviation from past trends are called prediction 
methods. These terms were chosen by the investigator for 
convenience in the present study; some demographers use the
terms somewhat differently (Metropolitan Area Planning Council, 
1968:2 and Isard, 1960).

The methods which may be called projection methods rely 
on the trends of past enrollment figures. One method is that 
of fitting an equation to the curve of historical enrollment 
data ("projections by growth curve": Griffith, 1964:33-34). 

Collins and Langston (1961:10-12) listed three types of trend 
projections: (1) straight line, (2) average percent of increase,
and (3) average numerical increase. The straight line method
consists of graphing the enrollment data for a grade or group 
of grades over a period of years and graphically projecting 
a straight line to determine projected enrollments. The 
average percent of increase, or geometric ratio method, 
consists of projection by applying the average annual percent 
increases in enrollment in a grade or group of grades. The 
average numerical increase, or arithmetic ratio, is a similar 
procedure. The average gain in numerical enrollment, rather 
than the average percent of increase, is applied to enrollment 
in successive years in the projection. 
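
The two averaging procedures can be illustrated with a short sketch. The function names and the enrollment history are hypothetical, and compounding the average percent increase year over year is one reading of "applying" it in successive years.

```python
def average_numerical_increase(history, years_ahead):
    """Arithmetic projection: add the mean yearly gain each year."""
    gains = [later - earlier for earlier, later in zip(history, history[1:])]
    mean_gain = sum(gains) / len(gains)
    return [history[-1] + mean_gain * k for k in range(1, years_ahead + 1)]

def average_percent_increase(history, years_ahead):
    """Geometric projection: compound the mean yearly rate of increase."""
    rates = [(later - earlier) / earlier
             for earlier, later in zip(history, history[1:])]
    mean_rate = sum(rates) / len(rates)
    return [history[-1] * (1 + mean_rate) ** k
            for k in range(1, years_ahead + 1)]

# Hypothetical enrollments for one grade over three years.
history = [100, 110, 121]
arithmetic = average_numerical_increase(history, 2)
geometric = average_percent_increase(history, 2)
```

With this history the arithmetic projection adds 10.5 pupils a year, while the geometric projection grows by 10 percent a year, so the two diverge as the projection period lengthens.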

A method of forecasting school enrollment from total 
populations assumes that an observed ratio between total 
population and school enrollment that has existed in the past 
will also exist in the future; the ratio is then applied to an 
estimate of future population (American Association of School 
Administrators, 1947:55; Griffith, 1964:32-33). A similar 
method is described by Strevell (1952:37-38); it allows for 
a projection of the ratio rather than assuming a constant ratio. 
Straight lines fitted by the least squares technique to 
historical ratios of school enrollment to population have been 
used to project United States school enrollment (Simon and Fullam,
1968:89). Methods employing ratios of enrollment to population
are presented here as projection techniques since the ratio 
is derived from past trends and the population itself is often 
projected from trends. Forecasts employing the ratio to a 
population which is predicted, rather than projected, are 
considered as a separate method; examples are discussed below 
under the names of "housing projection techniques" and the
"land saturation method."

A more refined use of historical data is the "census 
class projection" described by Strevell (1952:35). It employs 
historical percentages of each census class enrolled in 
school, with a census class being defined as a given age group 
in a given year. Historical "migration ratios" are calculated 
by comparing each census class to the class one year younger 
the previous year and by comparing the age six census to births 
six years previous. Averages of these ratios over several 
years of experience are applied to census classes to simulate 
their advancement through the years of the projection. Historical 
percentages of each census class enrolled in school are then used 
to obtain enrollment projections. In this case, it is clear 
that both the enrollment percentages and the base population 
are projected rather than predicted.

Probably the most widely used method is the "percentage
of survival" technique. The procedure involves percentage 
of survival ratios analogous to the migration ratios used
in the census class projections. However, since the per- 
centage of survival method employs grade enrollments, rather 
than census class counts, it is not necessary to apply per-
centages of enrollment in school. The basic method employs
birth and enrollment records. The ratio of first grade 
enrollment to resident births six years previous is found 
by averaging the ratio over several years' experience. 

The ratios of the enrollment in grade g in calendar
year z to the enrollment in the grade g + 1 in calendar 
year z + 1 are also averaged across several years of exper- 
ience. Enrollments are projected by applying survival ratios 
to present enrollments to obtain projected figures for the 
first year and by applying the ratios to projected figures 
to obtain projection figures for successive years. The basic 
assumption is that the net effects of all factors which 
influence survival rates from birth through grade 12 will 
be the same for the projected period as they were during the 
period of experience used as a basis for projections (Peters,
1969:1).

The percentage of survival method has several other names: 
"forecasting by analysis" (MacConnell, 1957:30-31), "retention
ratio projection" (Strevell, 1952:32-36), "survival rate pro-
jections" (Griffith, 1964:38), and "percentage of retention"
(Greenawalt and Mitchell, 1966:4-5). Examples of the use of
the percentage of survival method include projections made by
the Massachusetts School Building Assistance Commission
(Greenawalt and Mitchell, 1966). An enrollment study of
Lexington, Massachusetts, used the percentage of survival
method for short-range projections (Metcalf and Eddy, Inc.,
1968); a study of Watertown, Massachusetts, employed percentage
of survival ratios (Hunt, 1967). The method has been used for
projecting elementary and secondary enrollments for the state
of Nebraska (Nebraska Co-ordinating Council, 1967); the Florida
Department of Education has developed a computerized planning
system which utilizes the percentage of survival method in
the enrollment projections (Daniel, 1969). Projections of
school enrollment in the United States as a whole have been
made by the percentage of survival method (Simon and Fullam,
1968:95).
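
The percentage of survival procedure described above can be sketched as follows. The sketch is illustrative only: the function names and all figures are hypothetical, and each survival ratio is taken here in the conventional direction, as enrollment in grade g + 1 in year z + 1 divided by enrollment in grade g in year z.

```python
def average_ratios(history):
    """Mean grade-to-grade survival ratios.

    history[y][g] is the enrollment in grade index g in year index y.
    """
    years, grades = len(history), len(history[0])
    ratios = []
    for g in range(grades - 1):
        yearly = [history[y + 1][g + 1] / history[y][g]
                  for y in range(years - 1)]
        ratios.append(sum(yearly) / len(yearly))
    return ratios

def birth_ratio(first_grade_history, births_six_years_before):
    """Mean ratio of first grade enrollment to births six years earlier."""
    values = [enrolled / births for enrolled, births
              in zip(first_grade_history, births_six_years_before)]
    return sum(values) / len(values)

def project(current, ratios, first_grade_ratio, future_births):
    """Advance enrollment one year for each entry in future_births."""
    projections = []
    for births in future_births:
        current = [births * first_grade_ratio] + [
            current[g] * ratios[g] for g in range(len(ratios))
        ]
        projections.append(current)
    return projections

# Hypothetical three-grade system observed for three years.
history = [[100, 90, 80], [110, 95, 81], [120, 99, 85.5]]
ratios = average_ratios(history)
entry = birth_ratio([100, 110, 120], [125, 137.5, 150])
future = project(history[-1], ratios, entry, future_births=[160, 170])
```

The second projected year is computed from the first projected year rather than from observed data, which is why the assumption of unchanged survival rates carries more weight the further out the projection runs.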

Brown (1961) described a modification of the percentage
of survival method which he called the "corrected promotion
method." Brown's method includes an estimate of the number
of new residents in each grade each year, i.e., the number
of children who might be expected from any new homes that
are built in the school district (p. 41). The number of
new residents expected is added to the number of children 
in a grade before application of the survival ratio. Brown 
did not discuss the possibility that this procedure would 
quite likely overcorrect for migration. If the derived 
survival ratios are to any extent based on the arrivals of 
new residents, the additions of predicted new residents before 
application of the ratios would be an overcorrection. 

In the study of Pittsburgh, enrollment projections were 
adjusted after percentage of survival rates were applied;
additions and subtractions of students in specific grades 
and years were made if a housing development or housing 
removal project were planned for the district (Center for 
Field Studies, 1966). This procedure is subject to the same 
kind of criticism as Brown’s corrected promotion method 
unless it can be assumed that the survival ratios do not 
project trends in housing development or removal. Slight 
adjustments were also made to the survival rates themselves 
to obtain high projections as well as most probable projections.
Survival rates were adjusted in some districts for the
beginning of the projection period; other adjustments were made 
for the second five years of the ten-year projection period. 
However, the report does not contain an explanation for the 
rationale or amount of the adjustments. 

Marshall (1968) used percentage of survival predictions 
adjusted for changes in trends. He made adjustments by 
selecting survival ratios, but he did not explain his method 
of selection. Similar kinds of adjustments were made in a study 
of Hartford; it is again unclear how the amounts of adjustment 
were determined (Center for Field Studies, n.d.). 

A sophisticated technique for modifying U. S. Census 
state population projections to obtain enrollment projections 
for individual counties is the "multiple regression equation 
or cohort-ratio" approach (Jaffe, 1968). Independent variables
in the model are school enrollment variables such as enrollments 
in groups of grades in the county; dependent variables are 
county statistics such as resident births, resident deaths, 
retail sales, number of households, income, and registered
vehicles. All independent and dependent variables are expressed 
as the ratio of the county's share of the total state figure
in a given year to its share the following year. Multiple
regression equations are developed, each relating one 
dependent variable to the independent variables, averaged 
across years for which the ratios are obtained. Dependent 
variables are projected by fitting statistical trend 
functions and extrapolating. The multiple regression 
equations using projected dependent variables are solved
to obtain the projected cohort ratios of the independent 
variables, which in turn are used to obtain projected shares 
of state enrollment which the county enrollment will comprise. 

A method in which the historical trends used in the 
projection are those of another community is "projection 
by analogy" (Griffith, 1964:34). An attempt is made to 
locate a community which has had a growth pattern similar 
to the community under study, but which is now larger in 
population and public school enrollment. Insofar as 
possible, social and economic conditions should be similar. 
Enrollment figures or rates in the comparison community 
are used to predict the future enrollments of the community 
under study. 

Methods of forecasting which are at least in part 
predictive are "housing projection techniques" (Griffith, 
1964:35-36; Strevell, 1952:37) and the "land saturation
method" (Peters, 1969:4-5). The predictive aspect of the
housing projection techniques is the prediction of the 
number of households or dwelling unit types. Enrollments 
are calculated by applying the average enrollment per 
household or dwelling unit type. The average enrollment 
rate is most often based on historical data. After obtaining 
an estimate of future numbers of dwellings and number of children 
per dwelling, a committee making enrollment forecasts for 
Lexington, Massachusetts, applied a percentage enrollment 
figure calculated from historical data (Population and 
Housing Committee, 1966). The "land saturation method"
described by Peters (1969:4-5) is used to predict the 
growth of the population on the basis of anticipated use of 
available land for additional industrial and residential 
buildings; it is not made clear exactly how the population 
figures are translated into enrollment figures. 

A method which is designed specifically for enrollment 
forecasts rather than population forecasts is referred to 
as the "multivariable method" (Johnson, 1965) or presented
without a specific name (Center for Field Studies, 1953).
It can be used with input which is drawn from past trends
or that which is anticipatory of future trends, with the
latter type of input being the most common for the multi- 
variable method. The basic model employs estimates of 
each of the major factors affecting school enrollments and 
applies the factors separately to previous enrollments to 
derive projected enrollments. The exact statement of the 
model varies; a typical model is given by Johnson
(1965:186-187):

A. (Survival births) + (Preschool net migration) +
(Retentions in grade 1 previous year) - (Non-public
school enrollment grade 1 this year) =
(Estimated public school enrollment in grade 1).

B. (Public school enrollment grade 1 previous year) + 
(Non-public school enrollment grade 1 previous 
year) + (Net migration) - (Non-public school 
enrollment grade 2 this year) - (Retentions 
grade 1 previous year) + (Retentions grade 2 
previous year) - (Dropouts grade 2 previous 
year) = (Estimated public school enrollment in
grade 2).
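
The two equations above are simple sums and differences, and a minimal sketch may make the bookkeeping concrete. The function names, argument names, and all input figures below are hypothetical and serve only as illustration; they are not taken from Johnson (1965).

```python
def estimate_grade1(survival_births, preschool_net_migration,
                    retentions_g1_prev, nonpublic_g1_now):
    """Equation A: estimated public school enrollment in grade 1."""
    return (survival_births + preschool_net_migration
            + retentions_g1_prev - nonpublic_g1_now)


def estimate_grade2(public_g1_prev, nonpublic_g1_prev, net_migration,
                    nonpublic_g2_now, retentions_g1_prev,
                    retentions_g2_prev, dropouts_g2_prev):
    """Equation B: estimated public school enrollment in grade 2."""
    return (public_g1_prev + nonpublic_g1_prev + net_migration
            - nonpublic_g2_now - retentions_g1_prev
            + retentions_g2_prev - dropouts_g2_prev)


# Made-up inputs for a single community and year.
grade1 = estimate_grade1(500, 20, 15, 60)
grade2 = estimate_grade2(480, 55, 10, 58, 15, 12, 0)
print(grade1, grade2)
```

Because each factor enters as its own argument, a forecaster who expects, say, a change in migration can alter that one input without touching the others.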

A report by Arthur D. Little, Inc. (1966) on Quincy, 
Massachusetts, schools included a similar model based
on historical data. Johnson (1965) modified the calculated
historical trends to obtain input for his model. After 
calculating the projected net migration figure for women 
of child-bearing age, he reduced the calculated figure 
somewhat to account for his assumption that the rate of 
population growth was declining slowly in Englewood (p.171).
He also made the assumption that 1960 fertility rates would
decline slightly during the next few years (p.172), and he
made "intuitive judgments" about the changing character of
pre-school migration (p.177). In estimating school-age
migration, he hypothesized that most of the out-migration 
during 1960-64 had occurred in 1962 and 1963; his "most 
probable" estimate of future school-age net migration reflected 
a smaller loss of school-age population than an average of the 
experiences of the last four years would have projected. He 
made alternative predictions based on different assumptions 
about future private and parochial school enrollment. 

A variation of the multivariable model was used in a 
study of the greater Corning area of New York State (Center 
for Field Studies, 1954). Survival ratios were used, but 
adjustments were made for anticipated changes in the housing 
situation (p. Ap.16). Included in adjustment of past trends
for nonpublic school enrollment were the two new rooms planned
for the parochial schools. Retention and dropout rates were 
projected. 

Another example of the multivariable method in which 
some of the variables were predicted and some projected 
is a study of Arlington, Massachusetts (Center for Field 
Studies, 1953). Changes in trends in construction rates
and available land were considered in predicting migration 
ratios. Projections were made for variables such as dropouts. 

A multivariable study of Salem, Massachusetts (Center 
for Field Studies, 1956) is exemplary in its reporting of 
the considerations involved in adjusting for anticipated 
changes in trends: 

The computation . . . was adjusted to account for
the opening of the housing project at Rainbow
Terrace since the in-migration of this period
cannot be expected to be repeated each succeeding
five-year period [p.23].

A study of the economy of Salem, its housing, 
and its probable future in the area would seem 
to indicate that the average migration of the 
last fifteen years would be more typical of the 
future than the average migration of the last 
five years. During these years there has been 
an unusual out-migration apparently attendant 
upon the closing of the Pequot mill [p.24].

The Diocesan Superintendent reports no
knowledge of any plans to expand parochial
school facilities in Salem at the present time
[p.26].

The Diocesan high school, to be located at 
Peabody, is planned. If this high school is 
built, Salem high school pupils might attend.
It is estimated by the Diocesan office that
possibly 100 boys and girls from grade 8 of
Catholic schools would attend the freshman
year at the new high school [p.26].

A study of Brockton (Center for Field Studies, 1964)
is another example of the multivariable method using input
that goes beyond extrapolation:

The somewhat arbitrary decision to use a base 
index of 850 for projecting future births is 
justified by the following arguments: 

1. It is consistent with the prediction 
of slowly rising fertility ratios 
during the 1960's. 

2. It seems to parallel the allocated birth 
pattern observed in Brockton since 1960 
[p.A-6]. 

Rather than show the details of this net 
migration estimate, we will explain the key 
assumption which was used. Namely, approxi- 
mately two-thirds of the female net migration 
in Brockton between 1950 and 1960 is likely 
to occur during each five-year period between 
1960 and 1970. The reason for using this 
fraction of two-thirds is two-fold. First, 
two-thirds of the building permits registered 
in Brockton between 1950 and 1960 were issued 
during the last half of the ten-year period.
Second, there seems to be an indication that
the issuance of building permits (and hence,
the net migration figure) will not increase
much over the 1955-60 level [p.A-7].

After talking with the principal of each local 
private and parochial school, a projected yearly 
capacity was determined for each building. 

Since the demand for Catholic education appears 
to exceed the current capacity, the projected 
enrollment is simply the estimated capacity of 
these individual schools [p.A-10].

Griffin and Schmitt (1966) proposed a model similar to the
other multivariable models.

There are many ways in which the estimates for the 
variables in the model may be obtained; some of the
methods may be unique to an individual school system. To 
some extent, other methods of population and enrollment 
forecasting can be incorporated into the multivariable 
method. For instance, projections of housing trends could 
be used in the estimate of pupil migration. Estimates 
may be made by those persons most familiar with each of the 
separate variables or by one person who gathers information 
from many people and sources. General guidelines for fore- 
casting may be found in the literature on school enrollment 
prediction. Scammon (1962:39-41), Brown (1961:13,17, & 41), 
and Greenawalt and Mitchell (1966:24-31) suggested as sources 
the following records, agencies, and persons: U.S. Census
publications, local school censuses, birth records, marriage 
licenses, building permits, utility company records and 
projections, mail route changes, U. S. Office of Education, 
National Education Association, Chambers of Commerce, 
planning boards, zoning commissions, school enrollment 
reports, building and occupancy reports, tax records, real
estate developers, and government and news publications
which discuss general business conditions and the activity 
of home and industrial construction industries. 

The multivariable method was chosen as the basis of the 
procedure for producing probabilistic enrollment output. 

The literature on population and school enrollment forecasts 
gives suggestions of criteria for selection of a model. One 
criterion is that the model not restrict the forecaster to 
mere projection of past trends if he feels that there is an 
indication that the trends may take a new direction in the 
future. Evidence is seldom given in enrollment studies for 
the assumption that enrollment trends are stable and are 
likely to remain so; the assumption of continuation of past 
trends seems most often to be an option for simplicity.
It is this type of assumption which is criticized by many
demographers (Isard, 1960; Rosenberg, 1968:3) who emphasize
that such an assumption is often unjustified. Isard (1960)
said that there is no assurance that a graph or function that 
fits past data will adequately describe the future pattern. 
Gottlieb (1954:68, 110) stated that enrollment forecasts 
must not simply project past trends, but must take into
account the probable effects of political, social, and
economic factors.

The word "projections" in this study is reserved for 
those studies which project trends for such reasons as 
expediency and simplicity. An assumption of continuation 
of past trends is just as adequate as an assumption of 
change if it is based on the same careful reasoning about 
future developments. This type of reasoning is present 
in a study of Boston: 

Since the basic curve assumes that the influences 
affecting migration remain the same, evidence of 
the reasonableness of this position was sought. 

Project directors of the Boston Redevelopment 
Authority, settlement house directors, and 
welfare agency officials were among those asked 
whether to their knowledge any factor, aside 
from urban renewal, existed which would alter the 
population trend for the health and welfare area 
under discussions. Invariably, the response was
the same: i.e., that except for the impact of the
urban renewal program, public housing, and the
growth of Negro population, the present popula-
tion trends would continue [Sargent, 1962:A-8].

Probable effects of these three factors were examined. 

Any method can be adjusted to account for trend changes; 
the multivariable method, however, seems to be the most 
adaptable to these adjustments. The other methods dis-
cussed above involve, to a greater extent than does the
multivariable method, summary coefficients or statistical
relationships among variables. For example, the percentage 
of survival method is based on the assumption that enrollment 
changes can be expressed as the relationship of enrollment 
in one year to enrollment in another year; the trend pro- 
jection techniques and the ratio of enrollment to population 
techniques also assume stability of relationships among 
variables. Two ways of adjusting these types of methods for 
trend changes are to adjust the coefficients or statements of 
relationship or to adjust the enrollment projections themselves.
Brown (1961) attempted the latter; the problem with his 
method, as discussed above, is that he risks incorrect 
adjustment for the migration rate. It is difficult to adjust 
the enrollments after a coefficient, equation, or graphic pro-
jection has been applied. In adjusting for, say, changes in
migration trends, the summary coefficient gives no indication 
of the migration trend which it tacitly projects and it is, 
therefore, difficult to determine the amount of adjustment 
necessary. To adjust the coefficient itself is equally 
difficult; the contribution that migration rates make to the 
value of the coefficient should be determined.
such as birth rate, dropouts, retentions, and nonpublic
school attendance. 

Analyzing enrollments by separate variables, as in the 
multivariable methods, simplifies adjustment of single 
variables (Center for Field Studies, 1956; Whitla, 1954). 

Of course, the multivariable method does not consider 
separately every possible variable affecting school 
enrollment; for example, it treats migrations as a single 
variable, whether the migrations are the result of a new
industry, a housing boom, or a highway. For generalizability
and manageability, the multivariable method does group 
variables to some extent; however, the forecaster is encouraged 
to consider separately the various factors in the school 
system which relate to migration and the other variables. 

Another argument can be made in favor of the multi-
variable approach: it is more adaptable to probabilistic
modifications of the input. Just as any other method could be
adjusted in some way for predictive input, any method could
be adjusted for probabilistic input. It seems, however,
that the forecaster has a more rational basis for placing
probability limits around estimates for a single variable 
such as migration than around coefficients or other 
statistical summaries. Of course, if he were willing to 
make the necessary assumptions, he could place probability 
limits around a coefficient by analyzing the variance of 
past data. An example is given by Peters (1969). He 
computed not only the means of historical survival ratios, 
but variances as well. Confidence intervals of projected 
enrollments are derived from the use of the variances of the 
survival ratios. 
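
As a rough illustration of placing limits around a survival ratio from the variance of past data, the sketch below computes the mean and standard deviation of a short series of ratios and applies an approximate 95 percent interval to a current enrollment. The ratios, the enrollment figure, and the normal approximation are all assumptions made for this example; the sketch is not a reconstruction of the procedure Peters (1969) actually used.

```python
import statistics

# Hypothetical grade 1 to grade 2 survival ratios for five past years.
historical_ratios = [0.97, 1.01, 0.99, 1.03, 1.00]

mean_ratio = statistics.mean(historical_ratios)
sd_ratio = statistics.stdev(historical_ratios)  # sample standard deviation

current_grade1 = 500   # hypothetical current enrollment
z = 1.96               # normal deviate for an approximate 95 percent interval

point = current_grade1 * mean_ratio
low = current_grade1 * (mean_ratio - z * sd_ratio)
high = current_grade1 * (mean_ratio + z * sd_ratio)
print(round(low), round(point), round(high))
```

The interval reflects only the scatter of past ratios; it says nothing about an anticipated change in trend, which is the limitation discussed in the text.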

A third reason for the desirability of easily adjustable 
predictions is the opportunity for precise adjustment of the 
long-range predictions after the short-range predictions for 
the separately estimated variables have been validated by 
experience (Center for Field Studies, 1956:22). 

Another advantage is that an easily adjustable method 
can be used to simulate the results under different assumptions 
about the various variables. For example, the effects of a 
contemplated change in the policy for retaining students can 
be studied. The multivariable model, even without the 
addition of Monte Carlo simulation, can be considered to be
a type of simulation. Thus, it has some of the advantages
of the simulation model, one of which is the possibility of 
simulating the results of a policy decision rather than 
relying on costly trial and error (Malcolm, 1958:57). An 
example of the use of a multivariable model in this way is 
a project undertaken by UNESCO (1966).

The model chosen should not only be easily adjustable 
for future trends and probabilistic input and output; it 
should be demonstrated to be demographically sound. However, 
school enrollment forecasting techniques have not been 
rigorously examined. There is a record of only a few studies 
directed toward an evaluation of the various types of fore- 
casting methods and these evaluations have been far from 
satisfactory (Jacob S. Siegel, 1953). What evidence there is 
seems to indicate that all methods are susceptible to 
startling errors under certain conditions. In addition, 
several authorities have despaired over the seeming impossi- 
bility of producing accurate population forecasts for small 
areas. Even national and state population forecasts, con- 
sidered surer than those of small areas, have in the past 
demonstrated gross errors. In 1930, W. S. Thompson and 
P. K. Whelpton predicted that the 1960 population of the 
United States would be between 137.9 million and 167.3 
million; the 1960 census figure was 179.3 million (Greenawalt
and Mitchell, 1966:5).

Greenawalt and Mitchell (1966) examined the accuracy of 
the percentage of survival method using predictions for the 
years 1952 through 1959 made by the Massachusetts School 
Building Assistance Commission for 242 towns and cities in 
Massachusetts. It was assumed that a forecast which pre- 
dicted enrollment within plus or minus 10 percent of the actual 
enrollment was "accurate." Of the 242 predictions studied,
149 were found to be inaccurate by this definition (p.8).
They concluded that the percentage of survival method is most
likely to be in error in fast growing communities (p.15). 

Experience has shown graphic techniques not to be very 
accurate methods of forecasting (American Association of 
School Administrators, 1947:55). Arithmetic and geometric 
progressions have been shown to be "surprisingly accurate,"
but only because the projections were continually reassessed
to account for current trends (Center for Field Studies,
n.d.:41).

Larson and Strevell (1952) reviewed 31 enrollment 
forecasts made in twelve states during 1930-1952 by local 
boards of education, private survey firms, taxpayers' 
associations, the U. S. Office of Education, and schools
of education and extension services of universities. The
first two of the three methods reviewed are similar to 
the ratio of enrollment to population methods. In the 
first method, the population is estimated by statistical 
projection of past population. In the second, the population 
is estimated by consideration of community factors such as 
numbers of gas and electric meters, water meters, and 
telephones. It is not clear in the description whether these 
data are to be used for projection or for prediction; in 
the latter case, the method would be similar to the housing 
projection techniques discussed above. The third method 
discussed is similar to the percentage of survival method. 
None of the three techniques were shown to be clearly 
superior in terms of gross error. Although no statistical 
tests were performed, the report contained the median 
percentages of error for forecasts of one to five years; 
the percentages were, respectively, 6.2, 9.9, and 7.7. 
Forecasts made by the first method were more likely to 
be overestimates.

According to Whitla (1954), the multivariable method
is "most accurate" in estimating enrollments. He reported 
a study in which percentage of survival forecasts were not 
as accurate as multivariable forecasts. The study was pre-
pared by the United States Bureau of the Census for fore-
casting statewide school enrollments.

The review of the literature on enrollment fore- 
casting techniques seems to justify the choice of the multi- 
variable method for prediction and as the basis of the pro- 
babilistic technique developed in this study. The form of 
the multivariable model used in the present study for grades 
0-12 is diagrammed in Figure 2.1 (p.38). The form of the 
model for the other grades is similar. The forecaster places 
probability limits around the separate variables in the 
model. Specifically, the forecaster is asked to give each 
variable a high estimate, a most likely estimate, and a low 
estimate, with the high and low estimates representing the 
limits of the 98 percent confidence interval. The problem 
is accounting for this probabilistic input in the prediction 
output. 
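
One way to read the three estimates is sketched below: the most likely estimate is taken as the mean of a normal distribution, and the high and low estimates are taken as symmetric limits enclosing 98 percent of the probability. The function name and the migration figures are hypothetical, and the symmetry of the limits is an assumption of this sketch rather than a requirement stated in the study.

```python
from statistics import NormalDist


def normal_from_three_estimates(low, most_likely, high, coverage=0.98):
    """Build a normal distribution whose central `coverage` interval
    spans (low, high). Assumes the limits are symmetric about the mean."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 2.326 for 98 percent
    sigma = (high - low) / (2 * z)
    return NormalDist(mu=most_likely, sigma=sigma)


# Hypothetical net migration estimates: low -40, most likely 10, high 60.
migration = normal_from_three_estimates(-40, 10, 60)
inside = migration.cdf(60) - migration.cdf(-40)
print(round(migration.stdev, 2), round(inside, 2))
```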

Similar problems have been conceived in terms of 
Markov chains. The model would have to be modified some- 
what to be considered in these terms; for example, migration 
figures would have to be considered in terms of probabilities 
that a student in a certain category would migrate.

Fig. 2.1. Multivariable model used in the present
study.

A finite Markov process, or Markov chain, is a multistage stochastic
process such that the probability of a process being in any 
one of a finite number of states at time t + 1 is conditional 
on the state the process is in at time t and the matrix of
probabilities of moving among the states (Burford, 1966:5; 
Bartos, 1967:31). In terms of the multivariable model, 
the states might be categories such as first grade student, 
second grade student, dropout, or deceased. There are problems, 
however, in conceptualizing the problem in terms of Markov 
chains. One of the basic assumptions of Markov chains is that 
the transition probabilities would be the same for each year 
of the prediction. A second restriction is that it must be 
possible to obtain estimates of probabilities of moving from 
any given state in one period to any other state in the next 
period. It is difficult to express births and in-migrations as 
a percentage of a set of possible in-migrations and births.
The logical method of representing in-migrations would be
to have a state called "world" (Bartos, 1967:135), but to 
estimate in-migrations in a school system as a percentage 
of the world population or even of the United States or a 
single state's population is infeasible. 
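
A small sketch may help show both the mechanics of a Markov chain and the difficulty just described. The three states and all transition probabilities are invented for illustration. Because every pupil in next year's counts must come from a state in this year's counts, the chain has no natural place for births or in-migration.

```python
STATES = ["grade 1", "grade 2", "left system"]

# Row i holds the probabilities of moving from state i to each state.
TRANSITION = [
    [0.05, 0.93, 0.02],  # grade 1: retained, promoted, left
    [0.00, 0.04, 0.96],  # grade 2: retained, or left (promoted out, dropped)
    [0.00, 0.00, 1.00],  # "left system" is absorbing
]


def step(counts, transition):
    """Advance the expected counts in each state by one period."""
    n = len(counts)
    return [sum(counts[i] * transition[i][j] for i in range(n))
            for j in range(n)]


this_year = [500.0, 480.0, 0.0]
next_year = step(this_year, TRANSITION)
for name, count in zip(STATES, next_year):
    print(name, round(count, 1))
```

In this example the expected grade 1 count falls to the retained pupils alone, since no state supplies new entrants; the same fixed matrix would also be reused every year, which is the stability assumption noted above.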

An example of a computerized application of a modified
Markov chain to school populations is Dynamod II (Zabrowski,
1969). The assumption of stable transition probabilities 
in this situation is not serious because the purpose of the 
model is not to predict new trends in school enrollment, 
but to provide educational planners with information on the 
impact on educational populations of proposed policy changes 
or of sudden shifts in the structure of the educational
system, by varying such factors as dropout rate or teacher- 
pupil ratio. Migrations were not included in the model since 
the study involved an analysis of national enrollment figures, 
but including births necessitated modification of the Markov 
chain model. The numbers of people within the system moving 
among states were calculated by transition probabilities, and 
estimated births were simply added to categories when applicable. 
Thus, Zabrowski circumvented the problem of transition 
probabilities involving the state representing birth. 

A similar study was done in Norway (Thonstad, 1967)
and was based on transitional probabilities for such states 
as grades, schools, deaths, dropouts, and graduates for the 
purpose of determining the educational distribution of the 
population to which the present propensities were leading.
But again in this use of Markov chains, predictions of
changing enrollment trends were not the purpose of the 
study. Thus, Markov chains have not been applied in these 
instances to essentially predictive problems and the under- 
lying assumptions of Markov chains limit their usefulness 
for solving the problem of calculating probabilities for 
predicted enrollments. 

Before considering Monte Carlo simulation as a solution, 
it is necessary to consider the possibility of straight- 
forward mathematical solution of the model as diagrammed 
above (Figure 2.1). "Every Monte Carlo computation that
leads to quantitative results may be regarded as estimating
the value of a multiple integral [Hammersley and Handscomb,
1964:50]." The multivariable model as modified in the
present study to accommodate probabilities is built with 
the assumption that the probability distributions around 
the separate variables can be considered to be normal. The 
model is similar to the linear combination of n independent 
random variables which are normally distributed. Thus, 
for normally distributed variables in the model

$$Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n,$$

the distribution of Y is normal and has a variance given by

$$\sigma_Y^2 = c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2 + \cdots + c_n^2 \sigma_n^2$$

(Hays, 1963:234, 236). For a normal distribution of
enrollments with known variance, it would be a simple task
to express the various probabilities. However, the present
model does not fit the pattern exactly; the model is not
simply a linear combination of the variables. As shown in
Figure 2.1, some of the variables are multiplied; for
example, the model includes the product of retention rate
and enrollment the previous year. Thus, the variances of
the variables cannot be added to determine the variance
of enrollments, and the final distribution of enrollments
is not necessarily normal. The model could be adjusted by
calculating the variance of the products and then calculating
the linear combination of single variables and products of
variables. To calculate the variance of a product of
variables X and Y, the following formula is used:

$$\sigma^2(XY) = \sigma_X^2 \sigma_Y^2 + \bar{Y}^2 \sigma_X^2 + \bar{X}^2 \sigma_Y^2,$$

where $\bar{X}$ is the mean of X and $\bar{Y}$ is the mean of Y (derived from
Kendall & Stuart, 1963:232-233).
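
The standard result for the variance of a product of two independent variables, Var(XY) = Var(X)Var(Y) + mean(Y)^2 Var(X) + mean(X)^2 Var(Y), can be checked numerically. The sketch below compares that expression with the variance of simulated products. The means (a retention rate of .10 and an enrollment of 1964) echo the example in the text; the standard deviations, the seed, and the sample size are assumptions chosen for illustration.

```python
import random
import statistics


def product_variance(mean_x, var_x, mean_y, var_y):
    """Variance of X*Y for independent X and Y."""
    return var_x * var_y + mean_y ** 2 * var_x + mean_x ** 2 * var_y


rng = random.Random(0)            # fixed seed so the run is repeatable
mean_r, sd_r = 0.10, 0.02         # retention rate (sd is made up)
mean_e, sd_e = 1964.0, 50.0       # previous enrollment (sd is made up)

# Draw the two variables independently and multiply them.
products = [rng.gauss(mean_r, sd_r) * rng.gauss(mean_e, sd_e)
            for _ in range(200_000)]

simulated = statistics.pvariance(products)
analytic = product_variance(mean_r, sd_r ** 2, mean_e, sd_e ** 2)
print(round(analytic, 1), round(simulated, 1))
```

The two figures agree closely only because the draws are independent, which is the assumption examined in the next paragraph.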

However, the use of the above adjustment assumes that 
all of the variables are independent. This assumption is 
violated in that previous enrollment is one of the single
variables and it also appears in one or more of the products
of variables. For example, the product of a retention rate 
of .10 and a previous enrollment of 1964 is subtracted from 
the previous enrollment of 1964; in this case, the two 
enrollment variables are obviously non-independent. 

A further complication is that it is more difficult 
to assume a normal distribution of calculated enrollments. 
Without an assumption of normality, it is more difficult 
to derive probabilities from knowledge of the variance. 

Monte Carlo simulation is proposed here as a satis- 
factory way to combine the probabilistic input to obtain 
the probabilistic output. A computer simulation model is 
a logical-mathematical representation of a concept, system, 
or operation programmed for solution on an electronic computer
(Martin, 1968:5). Solving a problem by the Monte Carlo 
method amounts to submitting the problem to a roulette wheel 
(McCracken, 1955). Extending this analogy, the compartments
of the wheel are labeled in such a way as to reflect the 
probability distribution. For instance, if a student had a 
95 percent chance of passing his course-work, one out of 20 
compartments would be designated "fail"; the others would be
designated "pass." A spin of the roulette wheel would
produce the "iteration-specific" value of the variable.
Spinning the wheel again and again to produce more iterations
would result in a distribution of iteration-specific values.
In this case, it would be a distribution of values with
approximately 95 percent of the values representing "pass."
The distribution of the outcomes in this problem is easily 
predicted. The solution to the enrollment problem proposed 
here is analogous to spinning the wheel several times for 
each iteration, with one spin for each variable contributing 
to the enrollment prediction. In this type of problem, the 
outcome is not so obvious. 
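
The roulette wheel analogy can be sketched in a few lines. Each call to the hypothetical `spin` function below plays the role of one spin of a wheel on which 19 of 20 compartments are labeled "pass"; the seed and the number of iterations are arbitrary choices for this illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable


def spin(p_pass=0.95):
    """One spin of the wheel: return the iteration-specific value."""
    return "pass" if rng.random() < p_pass else "fail"


outcomes = [spin() for _ in range(10_000)]
share_pass = outcomes.count("pass") / len(outcomes)
print(share_pass)
```

With a single variable the share of "pass" outcomes simply approaches the input probability; the simulation becomes informative when several such spins are combined in each iteration.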

In a Monte Carlo solution, one or more of the variables 
depend upon chance parameters whose values are randomly 
selected from probability distributions. Because of this 
random feature, the outcomes in a Monte Carlo solution
usually differ for repeated runs with the same input values; 
to produce statistically significant results, replications 
are required with the same inputs (Martin, 1968:33). In 
summary, Monte Carlo simulation involves random sampling 
from probability distributions to determine iteration-specific 
outcomes; the results of a number of iterations form the solution
of the problem.

Although the mathematical techniques used in the Monte
Carlo simulation had been used before, particularly in 
deterministic problems, the name and popularization of the 
technique dates from 1944 (Hammersley and Handscomb, 1964:6).
Research on the atomic bomb during the Second World
War involved a study of random neutron diffusion in fissile 
material. The scientists had basic data such as the average 
distance a neutron of a given speed would travel before 
collision with an atomic nucleus, the probability that 
the collision would result in a neutron bouncing off rather 
than being absorbed, and the amount of energy the neutron 
was likely to lose in the collision. However, it was 
impossible to sum all of these probabilities in a mathe-
matical formula. John von Neumann and Stanislaw Ulam
suggested a solution which was given the code name "Monte
Carlo" (McCracken, 1955:90).

The Monte Carlo approach to the solution of such a 
problem consists of pretending to trace the life histories
of a large number of neutrons. At each decision point in 
the history of each neutron, a random number is selected to 
determine which outcome occurs, in keeping with the 
known probabilities of occurrence. Through the accumulation
of a large number of such histories, it is possible to
estimate the percentage of neutrons which will terminate
in each of the final possible outcomes (Hammersley and
Handscomb, 1964:6).

Monte Carlo simulation can be used to solve both 
deterministic and probabilistic problems (Meyer, 1954:2). 
Problems are classified as deterministic or probabilistic 
depending on whether they are concerned directly with random 
processes or involve the assumption of a random process 
for the purpose of obtaining an approximate solution to the 
problem. The study of the behavior of neutrons was essentially 
probabilistic. An example of a deterministic problem is the 
approximation of area under a curve, not by calculus, but by 
recording the percentage of occasions on which a pin thrown 
randomly onto a graph falls inside, rather than outside, the 
curve.
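
The pin-throwing example can be sketched as follows. The curve y = x^2 on the unit square, the seed, and the number of pins are choices made for this illustration; the exact area, 1/3, is known from calculus and serves as a check.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable


def area_under_curve(f, n_pins=100_000):
    """Estimate the area under f on the unit square by throwing
    random pins and counting the share that land under the curve."""
    hits = 0
    for _ in range(n_pins):
        x, y = rng.random(), rng.random()
        if y <= f(x):
            hits += 1
    return hits / n_pins


estimate = area_under_curve(lambda x: x * x)
print(estimate)
```

The problem itself contains no randomness; a random process is introduced only as a device for approximating the answer.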

The present Monte Carlo simulation is analogous to the 
neutron behavior study in that the enrollments are considered 
to be stochastic, as were the neutron behaviors. A random 
number is used to determine iteration-specific enrollment 
variables, the combination of which represents an iteration- 
specific predicted enrollment. The model differs from 
stochastic Monte Carlo models such as the neutron simulation
in that the factors assumed to be probabilistic are the
values of the variables such as migration rather than 
the behaviors of individual students or neutrons. The 
proposed use of Monte Carlo for predicting school 
enrollments is more closely analogous to problems in operations 
analysis in which the random variables may be, for example, 
the estimated time needed to accomplish certain tasks. 
McCracken (1955) discussed the use of Monte Carlo in a wood- 
working shop problem to determine how the work of the shop 
should be scheduled to yield the greatest production, con-
sidering a number of variable conditions. If one knows that
a certain job would take from 12 to 16 minutes,
he could simulate the operation by random sampling. In the
school enrollment problem, the values of variables such as 
migration are analogous to the time variables in the shop 
problem. 
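
A minimal sketch of the idea, under stated assumptions, follows. The one-grade model used here (pupils not retained in grade 1 move up, and net migration is added) is a hypothetical simplification and not the model of Figure 2.1; the three estimates for each variable, the seed, and the number of iterations are made up. Each variable is drawn from a normal distribution whose 98 percent interval is given by its low and high estimates, assuming symmetric limits.

```python
import random
from statistics import NormalDist, quantiles

Z_98 = NormalDist().inv_cdf(0.99)  # half-width of a 98 percent interval


def draw(rng, low, most_likely, high):
    """Sample one iteration-specific value for a variable."""
    sigma = (high - low) / (2 * Z_98)
    return rng.gauss(most_likely, sigma)


def simulate_grade2(previous_grade1, retention, migration,
                    iterations=20_000, seed=0):
    """Return one predicted grade 2 enrollment per iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        rate = draw(rng, *retention)     # one "spin" per variable
        movers = draw(rng, *migration)
        results.append(previous_grade1 - rate * previous_grade1 + movers)
    return results


enrollments = simulate_grade2(
    previous_grade1=1964,
    retention=(0.08, 0.10, 0.12),   # (low, most likely, high), made up
    migration=(-40, 10, 60),        # (low, most likely, high), made up
)
cuts = quantiles(enrollments, n=100)  # 99 percentile cut points
low_1, median, high_99 = cuts[0], cuts[49], cuts[98]
print(round(low_1), round(median), round(high_99))
```

The percentiles of the simulated enrollments play the role of the probabilistic output: the forecaster can read off, for example, the enrollment that is exceeded in only one iteration out of a hundred.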

According to Malcolm (1958:57), simulation has the 
advantage of being easily understood because it is relatively 
free of complicated mathematics. Also, a mathematical 
solution may be unavailable or too complex to apply. Orcutt 
(1962:101) stated that Monte Carlo simulation enables the 
user to introduce into his model interactions, variables,
non-linearities, and stochastic considerations that he might
not be able to introduce using a mathematical method.

Monte Carlo simulation has been applied to problems 
in demography. Sheps applied Monte Carlo techniques to 
construct stochastic micro-models of demographic behavior 
(Demeny, 1964:70). Part of Sheps' model was described by 
Beshers (1964) : 

She puts each individual woman of the cohort
stochastically through a number of contingencies,
year by year. First, she ascertains how long
each woman is likely to live; when she is likely
to marry for the first time; and when she is going
to become sterile. Once a woman is married and
fertile, the program ascertains, by the same kind
of stochastic process, whether or not she is a
family planner. Then her history through child-
bearing is determined from her age, parity, and
the other contingencies that occur [p.72].

A team led by Orcutt (1961:285-350) conducted a
socio-economic simulation of the United States using Monte Carlo
simulation. The simulation resulted in a distribution of
people among various demographic categories.

Other kinds of simulation have been used to study
school enrollments. Members of the UNESCO staff developed
a system simulation for Asian countries (1966). Mathematical
models for enrollment projection which do not include Monte
Carlo simulation have been developed for Sweden (Forecasting
Institute, 1967) and Great Britain (Armitage and Smith, 1967).

The potential value of Monte Carlo in school enrollment
studies has been recognized, but the technique has not yet been
applied. Armitage and Smith (1967) outlined the reasoning:

Even at this stage, our model calculations will still 
be purely deterministic, i.e. everything happens 
exactly as specified by the equations. . . .If we are 
to make some assessment of the value of deterministic 
calculations, then it is convenient to regard the 
transition proportions as probabilities. . . .The 
model will then consist of simultaneous multinomial 
distributions and while the explicit treatment of such 
a system would be intractable, it will be possible to 
carry out Monte Carlo simulation calculations in 
which the values of all (variables) ... are found by 
sampling from multinomial distributions. The intro- 
duction of random variation in this way is one line 
of attack upon the 'noise' problem, i.e. the dis- 
tortion of the signal of the deterministic calculation 
by the presence of variation, the fact that things do 
not happen exactly as expected. The deterministic 
calculation can still be regarded in the traditional 
way as providing 'the best estimate' of what is 
expected to happen; but these simulation calculations
will give some indication of how far we are justified
in placing our faith in these 'best estimates' and
acting upon them [pp. 184-185].

Griffin and Schmitt (1966) detailed a model for using 
Monte Carlo simulation in enrollment predictions. However, 
the actual simulation was not performed. 

The present study includes the performance of the simulation,
as well as basic changes in the Griffin-Schmitt model and tests
of validity and reliability not included in their design.

Chapter III
Design of the Study

The major purpose of the study was to develop a satis- 
factory method of enrollment forecasting to produce multiple 
figure predictions with associated probabilities. A review 
of the literature indicated that a Monte Carlo simulation 
using the multivariable method as a basis for the simulation 
might well be a satisfactory solution. Thus the first major 
objective of the study was to develop a system of computer 
programs for performing Monte Carlo simulations employing the 
multivariable model. The first objective involved the following
steps:

(a) collection of data on which to test the 
simulation, 

(b) adaptation of the multivariable model to 
the present purposes, 

(c) writing the computer programs for the 
simulation, incorporating previously 
written programs to generate random numbers, 

(d) performing the simulation using the sample data, and 

(e) preparing instructions for future users of the
simulation method.

The second major objective was to investigate the predictive
validity, reliability, and concurrent validity of the method
using statistical tests with sample data.

Data needs for the present study were somewhat unusual in
that the required data were not measures on variables for various
populations, but predictions of school enrollments.

Historical predictions, that is, predictions which were 
made in some previous year, were needed for the test of pre- 
dictive validity. The reason, of course, is that historical 
predictions can be tested for their accuracy of prediction 
for the intervening years. 

The data chosen to represent the multivariable model were 
the basis of an enrollment prediction study for Brockton, 
Massachusetts, made by the Center for Field Studies (1964:
Appendix). The data had to be modified somewhat; the most
significant modification of the data made by the present 
investigator was the addition of high and low estimates for 
the variables, which were given only one estimated value by 
the Center for Field Studies. The data from this particular 
enrollment study were chosen after an extensive search of the 
literature and over thirty interviews with educational con- 
sultants, university personnel, state department of education 
personnel, school administrators, city planners, and
school committee members. The search revealed that the 
multivariable method was not among the most common methods 
presently used. However, the method was used in several studies 
by the Center for Field Studies of Harvard University Graduate 
School of Education. One of the studies, that of Brockton,
Massachusetts, seemed to be most adaptable for the present
purposes, although no example of the multivariable method with
"high" and "low" estimates for the variables was available.

Brockton is a city in southeastern Massachusetts, twenty 
miles from Boston, with a 1965 population of 83,499. During
the decade of 1955 to 1965, the population of Brockton
increased by 20,871, with an estimated excess of births over
deaths of 8,167 and an estimated net in-migration of 12,704
persons. Brockton is an industrial city with shoe manufacturing
the predominant industry. The median income of Brockton families
in 1960 was $5,914, somewhat below that for the state as a whole
(Massachusetts Department of Commerce and Development, 1967).

The Brockton data provided predictions for each of the 
variables and for total enrollments for grades 1 through 12 for 
the academic years 1964-1965 through 1975-1976. Since the
Brockton study utilized fall enrollment figures and the
computer programs developed in this study were designed 
to predict fall enrollment, a prediction for the academic 
year 1964-1965 is considered a prediction for the year 
1964. 

Besides the addition of high and low estimates for 
variables, the data provided in the Brockton enrollment 
study had to be modified to fit the specifications of the 
particular multivariable model developed for the present study. 
In Figure 3.1 (page 54) is the model used by the Center for 
Field Studies to predict grade 8 enrollment; in Figure 3.2 
(page 55) is the model used in the present study. Accompanying 
both models are the data for predicting grade 8 enrollment 
in 1964. Variations of the latter model for predicting 
other grades include use of birth data in predicting grade 1 
and omission of the dropout variable below grade 8; the model 
for grades 9-12 is identical to the model for grade 8. The 
model developed by the Center for Field Studies is referred 
to as the "Center" model and the one modified by the present 
investigator is the "modified" model.

Fig. 3.1. Multivariate model for grade 8 used by the Center
for Field Studies. (In parentheses are the figures used to
predict 1964 enrollment.)

Fig. 3.2. Multivariable model for grade 8 used in the
present study. (In parentheses are the figures for
predicting 1964 enrollment without simulation.)

The modified model includes the dropout variable in
predictions for grades 8-12; the Center model included it for 
grades 9-12 only. The dropout variable for grade 8 was 
used in several of the enrollment studies reviewed; in 
order to make the model as generalizable as possible, the 
variable was included in grade 8 of the modified model. 

The modified model also includes rate of death and
institutionalization as a variable at each grade level as
recommended by Collins and Langston (1961:10). The Center 
model included deaths for only the first year of age. The 
modified model uses a variable for preschool deaths, which 
includes deaths during the first year. If a user wishes to 
estimate first year deaths only, as in the Center model, the 
preschool death variable may be used to record an estimate for 
first year deaths, and all values of the variable for school- 
age death or institutionalization may be set at zero. 

It has been recommended (Center for Field Studies, 1954: 
App. 13) that age-specific birth rates be used when estimating 
future birth rates. Age-specific birth rates provide for the 
fact that women of certain ages have more children than do
women of other ages; estimates of birth rate by this method take
into account shifts in the age distribution of the female
population and birth rate for each of six age groups. The
age groups used for age-specific birth rates are 15-19,
20-24, 25-29, 30-34, 35-39, and 40-44. The reports show a
few births occurring before or after these ages. However, 
the numbers involved are too small to give significant results, 
and it is probable that some of these are due to mistakes 
regarding age (Whelpton, 1954:28). Births in Brockton were 
estimated by age groups by the Center's staff, but the 
Center model itself requires only that total births be 
estimated. 

Similarly, one assumption on which the modified model
is based is that the variables concerning first year deaths,
retentions, student deaths and institutionalization, and 
dropouts, can best be predicted as proportions, rather than 
numbers, since the size of these estimates seems closely 
related to the size of the base population. The modified model 
requires estimates of these proportions; the Center model 
requires only estimates of the numbers of students involved, 
although these numbers were obtained in the Brockton study 
by projections of past trends in the proportions of students 
involved. 
Both models have an advantage over some prediction models
in that they require that migration and nonpublic school 
enrollment be estimated as numbers, rather than as percentages. 
An advantage of the multivariable method in general is that 
it does not confound variables to the extent that they cannot be 
accurately predicted; it would be unnecessary compounding of
variables to estimate migration or nonpublic school enrollment 
as a percentage of, say, total school population (Center for 
Field Studies, 1954: App. 9). The assumption is that demo- 
graphic data such as housing data are more accurately used to 
estimate the numbers of migrants than to estimate the percentage 
of migrants in total school population, which has other sources 
of variation in addition to migration. 

Two variables in the Center model, private or parochial
school enrollment for the previous year and private or
parochial enrollment for the predicted year, are combined in
the modified model to form one variable: net transfers to/from
nonpublic schools. Besides simplifying the model, the combination
makes it possible to collect much of the data necessary
for the prediction from public school records. The nonpublic 
school variable can be adjusted in cases where there are 
children residing in the district attending school outside 
the district and out-of-district children attending public
schools in the district under study.

Another slight difference between the Center model and 
the modified model is that all net migrations after grade 1 in 
the modified model refer to in- and out-migration of public 
school students. The Center model allows for the estimation 
of total migration in the district for each grade, including 
migration of nonpublic school students. The reasoning behind 
the modification was to simplify the task and to make as much 
of the data as possible available from public school records. 
Some schools, of course, would have to begin keeping the 
necessary records if this prediction method were to be used. 

Some interpolation of the Center data was necessary to 
obtain predictions of the number of women in each age group 
and the age-specific birth rate for each year of the simulation. 
The Center data provided estimated female population for the 
years 1965 and 1970 only; births were calculated for these
years by applying age-specific birth rates, and the predicted 
births for the remaining years were obtained by interpolation. 
Since the modified model requires age-specific birth rates 
and migrations of women for each year, these variables had to 
be interpolated from the 1965 and 1970 values. These interpolations
were adjusted, as necessary, to coincide with the
births as interpolated directly. 

The Center data did not include a breakdown by sex for the 
predictions, but the proportion male for each estimated 
variable was needed to test the functioning of the part of the 
computer program designed to produce sex breakdown. Since no 
tests of validity for the sex breakdown were performed, a some- 
what arbitrary estimate of the proportion male was satisfactory. 
The proportion male chosen for each of the variables was .516, 
the proportion male in the 1963 total public school enrollment 
(Brockton, Massachusetts, School Department, 1963). 

A major revision of the Center data necessitated a re- 
calculation of the predictions for the modified model. The year 
1963 was used as the base year; that is, the historical enroll- 
ments by grade in 1963 were adjusted to become the predicted 
enrollments for 1964, the first year to be simulated. The 
historical enrollments recorded for 1963 in the Center's 
study were not the same as the 1963 enrollments recorded in 
the Annual Report of the School Department of Brockton (1964).
Since historical data from the Annual Reports were used to
calculate the percentage of survival projections for use in
the comparison of the predictive validity of the multivariable 
method with that of the percentage of survival method, con- 
sistency in the 1963 figures was necessary. Howard Johnson 
(1969), formerly with the Center for Field Studies, explained 
that the enrollments had been adjusted for factors such as 
special education students and homebound students. Since he 
was not able to state the exact means of adjustment, the 
decision was made to use the figures from the Annual Report 
for all 1963 enrollments. 

Since the required high and low estimates were not made in 
the Center's study, these were set somewhat arbitrarily by the 
present investigator. However, guidance for these estimates 
was available; recent historical data for most of the variables 
were given in the Center's report. 

The criteria for choosing high and low estimates for each 
of the variables are based on the assumption that the predicted 
values of the variables represent a probability distribution
in the form of a beta distribution. A hypothetical beta 
distribution was used in a similar situation requiring estima- 
tion of variables, the PERT system. PERT (Program Evaluation 
and Review Technique) was developed for use by the U.S. Navy 
Special Projects Department for planning and controlling
the development of the Polaris submarine weapon system 
(Lambourne, 1967:42). 

The decision to use the beta distribution in the present 
problem was based on the precedent set by the PERT system 
and the similarities between the PERT system and the present 
problem. The PERT system requires the user to estimate the 
"optimistic," "pessimistic," and "most likely" time that it 
will take to complete various activities; this is analogous 
to the estimation of "high," "low," and "most likely" figures 
in the present study. PERT involves the estimation of time 
required to achieve events together with an estimation of the 
uncertainties involved; the system is based on human judgment 
about events and times (U.S. Department of the Navy, 1958:1). 
Estimates for times to complete various "activities" are 
combined to estimate both the time needed to complete an 
"event" and the associated variance. The activities are 
analogous to the variables affecting enrollments; the events 
are analogous to the enrollments. The PERT model, however, 
avoids one of the complications of the present model: the 
activities are in linear combination, assuring that the 
distributions of event times will be normal under the central
limit theorem (Hays, 1963:242). 

Assumptions were made in the PERT model about the general
characteristics of the probability distribution of the time
involved in performing an activity:

It is felt that the distribution will have
but one peak, and that this peak is the
most likely time for completion. Thus,
the point m [see Figure 3.3 (p. 64)] is
representative of the most probable time.
Similarly, it is assumed that there is
relatively little chance that either the
optimistic or pessimistic estimates will
be realized. Hence, small probabilities
are associated with the points a and b.

No assumption is made about the position
of the point m relative to a and b. It
is free to take any position between the
two extremes -- depending entirely on the
estimator's judgment [U.S. Dept. of the
Navy, 1958:4].



The Beta distribution is usually in the following form:

\[
f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}, \quad 0 \le x \le 1; \quad \alpha, \beta > 0
\]

[Kendall and Buckland, 1957:26].



Fig. 3.3. A beta distribution.

It was assumed that variables in the present model exhibit
these same distribution properties. For instance, the
beta distribution is unimodal. However, the distribution
of a variable might not be unimodal when the variable is
influenced by the probabilities of a discrete major event,
such as the closing of a parochial school. Estimated transfers
from nonpublic schools might vary around a low figure if the
parochial school remained open and around a high figure if
it were closed. Instead of introducing bimodal distributions
into the model, it is possible to view the situation as two
unimodal distributions with probabilities p and 1-p of being
chosen. Thus the computer program could be modified, if
necessary, to use a two step random procedure: first, drawing
a distribution and second, drawing a value from the randomly
chosen distribution. The beta distribution also assumes
continuous values (Martin, 1968:72) and, of course, predictions
of enrollment deal with discrete values. This is not considered
a significant drawback, however; enrollments are considered
to be continuous until the output stage of the program at which
time they are truncated to integers.
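The two step random procedure described above can be sketched as follows. This is a minimal illustration in Python, not the Fortran of the study; the function name, the probability of 0.3, and the two normal distributions are hypothetical values assumed only for the example.

```python
import random

def draw_two_step(rng, p, first, second):
    """Draw a value from one of two unimodal distributions.

    first and second are (mean, sd) pairs; first is chosen with
    probability p and second with probability 1 - p.
    """
    # Step 1: randomly choose which distribution applies.
    mean, sd = first if rng.random() < p else second
    # Step 2: draw a value from the chosen distribution.
    return rng.gauss(mean, sd)

rng = random.Random(12345)
# Hypothetical transfers from nonpublic schools: about 150 if the
# parochial school closes (assumed probability 0.3), about 20 if not.
draws = [draw_two_step(rng, 0.3, (150.0, 10.0), (20.0, 5.0))
         for _ in range(1000)]
```

Taken together the draws form a bimodal set, although each single draw comes from a unimodal distribution.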

Lambourne (1967) described the theoretical meaning of
the optimistic, pessimistic, and most likely times:

Suppose an individual activity were repeated
under identical conditions a hundred times...
and we were to plot a graph of the statistical
distribution of the achieved times....We call
the one shortest duration out of the hundred
the Optimistic time, the one longest the
Pessimistic, and the time corresponding to
the high point, the Most Likely [p. 43].

For the PERT model it was desirable to use as the expected
values of activity times a value other than the most likely 
estimate, since obtained estimates showed the distribution to 
be skewed in many cases, usually with the likely time nearer 
the optimistic than the pessimistic time. An estimate of 
the expected value was made based on the probability density 
of the distribution and sample values. The estimate for the 
expected value of an activity time (E(t)) is as follows:

E(t) = (a + 4m + b)/6

[U.S. Department of the Navy, 1958: App. B(2), B(4)].

(See Figure 3.3 (p. 64).)

For unimodal frequency distributions, the standard 
deviation can be estimated roughly as one-sixth of the range. 
Using the points a and b to represent the range, the estimate
of the standard deviation becomes one-sixth of the difference 
between the pessimistic and optimistic time estimates: 

SD = (b - a)/6 
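The two approximation formulas can be illustrated with a short Python sketch. The function name and the three input figures are hypothetical and serve only to show the arithmetic.

```python
def pert_mean_sd(a, m, b):
    """Return the PERT approximations of the mean and standard deviation.

    a is the low (optimistic) estimate, m the most likely estimate,
    and b the high (pessimistic) estimate.
    """
    mean = (a + 4.0 * m + b) / 6.0
    sd = (b - a) / 6.0
    return mean, sd

# Hypothetical estimates for one variable, skewed toward the high side.
mean, sd = pert_mean_sd(10.0, 25.0, 70.0)
```

In this example the estimated mean (30) lies above the most likely estimate (25) because the high estimate is farther from m than the low estimate is.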

In the present use of the beta distribution, it was 
necessary to calculate the mean and standard deviation since 
each beta distribution is transformed into a normal dis- 
tribution with the same mean and standard deviation. The 
purpose of the transformation was to simplify the simulation
process; it is much simpler to draw a random number from a 
normal distribution than from a beta distribution. It was 
assumed that a better estimate of the probability distribution 
is obtained if the user is allowed to describe a beta dis- 
tribution which is transformed than if he is required to 
describe a normal distribution. The mean and standard 
deviation are estimated by the formulas used in the PERT
model.

High and low estimates were set for the input from the 
Brockton study by utilizing the historical data which were 
given for most of the variables. In most cases, the data for 
the past five years were given. There was some variation in 
the pattern; for instance, for the female population variable, 
data for the past eleven years were used. The available data
were used to calculate the variance across years; this 
variance was used to estimate the variance across hypothetical 
trials of the enrollment outcomes. The variance for the 
preschool net migration variable had to be determined arbi- 
trarily, since adequate data were not given in the report. 

The variance was used to obtain nonskewed distributions by 
setting a and b each three standard deviations away from m.
Although a and b theoretically enclose 98 per cent of the
cases, the choice of three standard deviations was made
since the difference between a and b is used as an estimate
of six standard deviations.
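The setting of nonskewed limits from historical data might be sketched as below. The sketch assumes that the sample standard deviation across years is the quantity used; the text does not specify the exact variance formula. The function name and the five-year history are hypothetical.

```python
import statistics

def symmetric_limits(m, history):
    """Place the low and high estimates three standard deviations from m."""
    # Assumption: sample standard deviation across the historical years.
    sd = statistics.stdev(history)
    return m - 3.0 * sd, m + 3.0 * sd

# Hypothetical five-year history of a dropout proportion.
history = [0.020, 0.024, 0.022, 0.026, 0.023]
low, high = symmetric_limits(0.023, history)
```

With limits chosen this way, (high - low)/6 returns the historical standard deviation, so the PERT estimate of the standard deviation agrees with the data from which the limits were set.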

A second set of limits was chosen to represent "skewed" 
limits. On one side of the most likely estimate, a point 
three standard deviations from m was chosen; on the other 
side, a point five standard deviations from m was chosen. A 
set of skewed limits was chosen so that a kind of concurrent 
validity of the model could be examined. Without simulation, 
the "most likely" enrollment estimate would normally be used 
in planning; with simulation output in the form described 
(see Ch. I, p. 8, supra), the .50 probability figure would
often be taken as the equivalent in importance for planning. 

To the extent that the original beta distribution of estimates 
is skewed, however, the .50 probability figure in the simulation 
and the "most likely" enrollment computed by the multi- 
variable method without simulation may not coincide. It is 
assumed that the estimates in the multivariable input represent 
the mode. A quotation from the Center study shows that the 
estimates do not seem to represent the mean or the median:
"Should the recent drop in the number of building permits
be suddenly reversed, this migration figure will probably
be too small" [Center for Field Studies, 1964: A-9]. If
the migration figure were intended to be an estimate of the 
mean or the median, it would seem that it should be somewhat 
higher to take account of the possibility that the trend 
in building permits would be reversed. If the planner 
wants an estimate of the mean or median, this distortion 
of the most likely estimate might be desirable. However, 
if he actually wants an estimate of the mode, the necessary 
information may be lost in the transformation from the beta 
to the normal. The direction of the skew for each variable 
was chosen so that the skew would be in the direction of 
larger total enrollment. Thus the "errors" would not com- 
pensate each other and the full impact of the skewing would 
appear in the total enrollment figures. 
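The effect of the skewed limits on a single variable can be worked out from the PERT formulas. In this sketch s denotes the standard deviation estimated from the historical data (a symbol introduced here only for illustration), and the skew is assumed to be toward the high side, with the low limit three and the high limit five standard deviations from m.

```latex
\begin{aligned}
a &= m - 3s, \qquad b = m + 5s \\
E &= \frac{a + 4m + b}{6} = \frac{6m + 2s}{6} = m + \frac{s}{3} \\
SD &= \frac{b - a}{6} = \frac{8s}{6} = \frac{4s}{3}
\end{aligned}
```

Under these assumptions the mean of the transformed normal distribution, and hence its .50 probability point, lies one-third of a standard deviation above the most likely estimate.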

The addition of simulation to the multivariable process 
provides for variation among the variables. In Figure 3.2 
(p.55), the 1963 grade 7 enrollment is adjusted to become the 
predicted grade 8 enrollment for 1964. For the simulation of 
grade 8 for 1964, net migrations, transfers to/from the 
nonpublic schools, retentions, deaths, and dropouts are considered
to be drawn from distributions with properties
determined by the high, most likely, and low estimates. In 
the simulation, the multivariable method is performed with 
values for some or all of its variables being randomly drawn 
from such distributions. Since the outcome depends in part 
on random numbers, repetition of the process is necessary 
to determine statistical properties; one hundred iterations 
were performed for each grade and year of the simulation. 
After the first year of the simulation, enrollments for 
previous years and grades are no longer single figures, but 
are predictions which vary across iterations and contribute 
to the variance of future enrollments. 
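The simulation of a single grade and year can be sketched in Python as follows. The sketch is illustrative only: the way the variables are combined is a simplified assumption and is not the model of Figure 3.2, and every estimate shown is hypothetical. Python's built-in normal generator stands in for the random number routine of the study.

```python
import random

def pert_mean_sd(low, likely, high):
    """PERT approximations of the mean and standard deviation."""
    return (low + 4.0 * likely + high) / 6.0, (high - low) / 6.0

def simulate_grade(prev_enrollment, estimates, iterations=100, seed=1):
    """Return sorted simulated enrollments for one grade and year.

    estimates maps a variable name to (low, most likely, high).
    Assumed structure: proportions leaving are removed from the
    previous grade's enrollment; migration and transfers are added.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(iterations):
        # Draw each variable from a normal distribution with PERT moments.
        v = {name: rng.gauss(*pert_mean_sd(*est))
             for name, est in estimates.items()}
        leaving = v["retention"] + v["death"] + v["dropout"]
        outcomes.append(prev_enrollment * (1.0 - leaving)
                        + v["migration"] + v["transfers"])
    return sorted(outcomes)

# Hypothetical (low, most likely, high) estimates.
estimates = {
    "migration": (-10.0, 20.0, 50.0),
    "transfers": (-5.0, 10.0, 25.0),
    "retention": (0.01, 0.02, 0.03),
    "death": (0.000, 0.001, 0.002),
    "dropout": (0.005, 0.010, 0.015),
}
outcomes = simulate_grade(1200.0, estimates)
median = outcomes[len(outcomes) // 2]
```

The sorted list of one hundred outcomes is what allows probability statements to be attached to particular enrollment figures.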

One main computer program and a subroutine for output 
were developed for the present study. The computer programs, 
MAIN and OUTPUT, were written in Fortran IV user language. 
MAIN also employs the subroutine GAUSS supplied by the IBM 
System/360 Scientific Subroutine Package (1968:77). 

GAUSS computes a normally distributed random number from 
a distribution with a specified mean and standard deviation. 
Twelve uniform random numbers are used to compute normal
random numbers by the central limit theorem. The result is
then adjusted to conform to the given mean and standard
deviation. GAUSS uses subroutine RANDU, also included in the
Scientific Subroutine Package, to obtain uniform random 
numbers which are found by the power residue method. The 
random numbers generated by RANDU are actually "pseudorandom" 
numbers generated through a mathematical process. In computer 
applications, generation of pseudorandom numbers is preferable 
to storing a lengthy table in computer memory (Martin, 1968: 
77). RANDU generates 2**29, or 536,870,912, terms before 
repeating the cycle (International Business Machines 
Corporation, 1968:77). The number of times RANDU is called 
in the simulation is considerably smaller than the number of
terms in the cycle.
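The approach described for GAUSS and RANDU can be sketched in Python. The sketch is not the IBM code: the multiplier 65539 and modulus 2**31 are the constants commonly documented for RANDU and are stated here as an assumption, since the text does not list them. The function names are chosen for the example.

```python
def randu(ix):
    """One step of a power residue generator (assumed RANDU constants).

    Returns the next integer state and a uniform number in [0, 1).
    The starting value should be an odd integer.
    """
    iy = (ix * 65539) % 2**31
    return iy, iy / 2**31

def gauss(ix, mean, sd):
    """Approximate a normal draw from twelve uniform numbers.

    The sum of twelve uniform numbers has mean 6 and variance 1, so
    subtracting 6 gives an approximately standard normal value.
    """
    total = 0.0
    for _ in range(12):
        ix, u = randu(ix)
        total += u
    return ix, mean + sd * (total - 6.0)

ix = 420013363  # one of the starting values reported in the study
values = []
for _ in range(10000):
    ix, x = gauss(ix, 100.0, 10.0)
    values.append(x)
sample_mean = sum(values) / len(values)
```

Because the state is passed in and returned explicitly, the same starting value always reproduces the same sequence, which is the property that makes control of the starting value useful.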

The first steps of MAIN give directions for reading the 
input data. In addition to historical data for the base years 
and estimates of variables for predicted years, the input data 
include information on the user options: the number of years
(≤15) to be simulated, indication of whether or not kindergarten
enrollments are to be predicted, indication of whether or not 
enrollments are to be calculated separately by sex, and the 
random number chosen to initialize the random number generator. 
A print-out of the input data is produced as a check on the
accuracy of the input. Next, means and standard deviations 
of the input variables are computed using approximation 
formulas (U.S. Department of the Navy, 1958: App. B(3),
B(4)).

The random number generator, GAUSS, is initialized by a 
randomly drawn nine digit odd integer available in the input 
data. The initialization is optional for GAUSS, which uses its 
own starting value when no value is chosen by the user. In 
order to perform a test for reliability in the present study, 
it was necessary to control the starting values so that the 
simulation can be performed using two different randomly 
chosen starting values. The two starting values, 420013363 
and 083632427, were drawn from a random number table (Hodgman, 
1959:238-239). The program could be modified to allow a 
user to omit the initialization step. 

Before the random number generator is used to draw values 
for enrollment variables, it is used to generate one thousand 
"throw-away" random numbers. This procedure was intended to 
avoid bias which might be present in the first few numbers 
generated. 

The body of the program MAIN is enclosed in two major 
DO loops. The outer DO loop varies the year of simulation 
from one to the number (≤15) specified by the user. The inner
DO loop varies the grade level of the predicted enrollment 
from one to 12 or 13, depending on whether or not predictions 
for kindergarten are included. 

The exact pattern of calculations for making predictions 
varies with the grade and year of the simulation. Within the 
two major DO loops, appropriate IF statements transfer control 
to the series of statements which calculates iterations for the 
particular grade and year. These series are actually DO loops 
which calculate the one hundred iterations for the specific 
grade and year; these are referred to as the four iteration 
DO loops. The first iteration DO loop is used in the calculations 
for grade K or 1 for the first five years of the simulation, or, 
if kindergarten enrollment is predicted, for the first four 
years of the simulation. The second DO loop also calculates 
grade K or 1 enrollment, but for the years of simulation not 
calculated by the first DO loop. Two slightly different methods 
are needed in this calculation because historical birth data 
are available for the first few years and must be predicted for
the other years. During trial runs of the simulation, the second 
DO loop included a print-out of predicted births as a check on 
the routine for calculating births from age-specific birth
rates and numbers of women. 

The third DO loop calculates enrollments for all grades 
above the first grade level to be predicted and for the first 
simulation year. The fourth DO loop calculates enrollment 
for these same grades for the remaining simulation years. 

The method of the third DO loop differs from that of the 
fourth loop in that the third uses historical enrollment 
whenever previous enrollment is required in the calculations;
the fourth uses iteration-specific predicted values. 

Since an iteration-specific prediction value of enrollment 
in one grade is used to predict the iteration-specific enrollment
in the next grade and year, perhaps a more straightforward
method of programming would compute predictions for all grades 
and years for one iteration before continuing to the next 
iteration. However, this would have caused complications 
in terms of computer memory space. All one hundred iterations 
for a year and grade unit must be retrieved at the same time 
so that their distribution may be described. However, it is 
not necessary to have the iterations for all grades and years 
stored simultaneously; only the iterations for the presently 
simulated year and the previous year are necessary. The
subroutine OUTPUT is called at the end of each iteration 
DO loop; however, provisions are made for storing iteration 
information for the present and the previous year. 
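The storage scheme described above, in which only the iterations for the present and the previous year are held, can be sketched in Python. The growth and survival figures are hypothetical placeholders for the full multivariable calculation, and the median summary stands in for the work of the OUTPUT subroutine.

```python
import random

ITERATIONS = 100

def simulate(base, years, seed=7):
    """Simulate all grades year by year, holding two years in memory.

    base lists historical enrollment by grade for the base year.
    Returns one list of per-grade medians for each simulated year.
    """
    rng = random.Random(seed)
    # Base year: every iteration starts from the same historical figure.
    previous = [[float(e)] * ITERATIONS for e in base]
    summaries = []
    for _ in range(years):
        current = []
        for g in range(len(base)):
            if g == 0:
                # Lowest grade: hypothetical entering cohort.
                values = [rng.gauss(base[0], 10.0)
                          for _ in range(ITERATIONS)]
            else:
                # Iteration i builds on iteration i of the grade below
                # in the previous year (hypothetical survival rate).
                values = [previous[g - 1][i] * rng.gauss(0.97, 0.01)
                          for i in range(ITERATIONS)]
            current.append(values)
        # Describe each grade's distribution while all iterations exist.
        summaries.append([sorted(v)[ITERATIONS // 2] for v in current])
        previous = current  # the older year is no longer needed
    return summaries

summaries = simulate([100, 95, 90], years=3)
```

Looping over iterations innermost keeps all one hundred values for a grade and year available together, at the cost of carrying one extra year of iteration values.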

Another instance in which information used in the prediction
of one grade is stored to be used in the prediction
of the next grade is the treatment of randomly drawn retention
rates. Since the retention rate in grade g is used in predicting
both grades g and g+1 for the following year, the randomly
drawn rate is stored after use in predicting g and used again
to predict g+1. This avoids the spurious variance which would
be introduced by randomly drawing another retention rate.

It should be noted that some imprecisions could be 
introduced in the first grade level predicted since estimates 
for preschool deaths are supplied to simple birth estimates 
rather than to birth estimates adjusted for preschool migration. 
Thus the preschool migration figures are not adjusted for death 
rates unless the preschool death rate estimate is modified to 
account for this; such refinement of the prediction, however, 
is not considered necessary. 

Since the program was designed to be generalizable to 
various enrollment prediction situations, and since some users 
may want to have output by sex, the option for breakdown by
sex was included. The user has an option to give an estimated 
proportion male, as well as the three other estimates for each 
variable. The estimate for the proportion male is given as a 
single figure, varying only across variables; no high or low 
estimates are given for the estimate of the proportion in 
order to keep programming and input requirements as simple 
as possible. Of course, this introduces some bias toward 
underestimation of the variance for predictions. 

An assumption on which the simulation was built is that 
of independence among the estimated variables. Random numbers 
from the distributions of these variables are drawn independently. 
In some cases, this assumption may be false, since variables 
such as retentions and dropouts, or births and migrations, 
may be related. But the development of a correction for this 
possibility would be more than is warranted by the exploratory 
nature of the present study. Variables used more than once 
in the computations, such as previous enrollments, are obviously 
not independent and are treated accordingly; the same value for 
the previous enrollment is used each time the enrollment is 
needed in the calculation. 

Subroutine OUTPUT places the iterations in ascending order,
computes the required percentiles, and prints the output. The
fifth percentile is computed, for example, by finding the
midpoint between the fifth and sixth ordered iterations. During
the trial stages of the program, OUTPUT included instructions
to print all one hundred iterations to assess the accuracy of
the steps for ordering the iterations and computing the
percentiles.
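
The percentile rule just described can be illustrated with a short
sketch. This is not the program listed in the Appendix; it is a minimal
Python rendering of the stated rule, assuming exactly one hundred
iterations. The names percentile_midpoint, probability_table, and
PROBABILITIES are hypothetical, and the probability levels are those
printed in Tables 4.1 and 4.2.

```python
# Illustrative sketch only; names are hypothetical, not from the original program.

# Probability levels printed in the output tables, expressed as percentiles.
PROBABILITIES = (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95)


def percentile_midpoint(iterations, p):
    """Return the p-th percentile of 100 iterations as the midpoint
    between the p-th and (p+1)-th values in ascending order."""
    if len(iterations) != 100:
        raise ValueError("the rule is stated for exactly 100 iterations")
    if not 1 <= p <= 99:
        raise ValueError("p must lie between 1 and 99")
    ordered = sorted(iterations)
    # ordered[p - 1] is the p-th ordered value (1-based counting).
    return (ordered[p - 1] + ordered[p]) / 2.0


def probability_table(iterations):
    """Pair each probability level with its percentile value."""
    return [(p / 100.0, percentile_midpoint(iterations, p))
            for p in PROBABILITIES]
```

For example, if the one hundred iterations were the integers 1 through
100, the fifth percentile under this rule would be 5.5, the midpoint
between the fifth and sixth ordered values.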

Pages 78 through 88 contain flow charts for MAIN and 
OUTPUT. Program steps for MAIN and OUTPUT and instructions 
for use of the computer programs are listed in the Appendix. 

Figure 3.5 (continued)

Key to Flow Chart for Main Program

1. Read (a) title of job, (b) number of years to be simulated,
(c) number of grades to be simulated, (d) indication of
whether or not sex option is to be used, (e) beginning year
of simulation, (f) integer to initialize random number
generator, and (g) variable formats to be used for parameter
and variable inputs.

2. If the sex option is to be used, GO TO 5.

3. Read all input parameters and variables.

4. GO TO 6.

5. Read all input parameters, including the proportion male for
each input.

6. Print information which was read in step 1.

7. If sex option is used, GO TO 10.

8. Print parameter inputs (totals).

9. GO TO 11.

10. Print parameter inputs for totals and for boys.

11. Compute means and standard deviations of input variables
using approximation formulas for the beta distribution.

12. Print input variables (high estimate, most likely estimate,
low estimate, mean, and standard deviation). If sex option
is used, print proportion boys for each variable.

13. Initialize the random number generator.

14. Generate 1000 throw-away random normal deviates.

15. Open DO loop which varies the year of simulation from 1
to IYEAR. (IYEAR is the number of years to be simulated.)

16. Open DO loop which varies grade level from 1 to IGRADE.
(IGRADE is either 12 or 13, depending on whether or not
kindergarten is included. "Grade level 1" refers to
either grade 1 or grade K.)

17. IF (I .GT. 1) GO TO 48. (I is the grade level.)

18. IF (J .GT. LIMIT) GO TO 32. (J is the year; LIMIT is either
4 or 5, depending on whether or not kindergarten enrollment
is predicted, and it represents the number of prediction
years for which historical birth data are available.)

19. Open DO loop for varying iterations from 1 to 100 in the
prediction of first grade level enrollment for one of the
first 4 or 5 years of simulation. This is the first of
four iteration DO loops in the program.

20. Subtract deaths from births to obtain tentative prediction.

21. Add to tentative prediction a randomly drawn preschool net
migration figure.

22. Subtract from tentative prediction a randomly drawn figure for
nonpublic school enrollment.

23. To obtain the final prediction, add to tentative prediction the
product of (a) a randomly drawn proportion of retentions in
the first grade level the previous year and (b) the enrollment
in the first grade level the previous year.

24. Store final prediction for iteration M, grade level 1,
year ≤ 5.

25. If sex option is not used, GO TO 27.

26. Proceed analogously to steps 20-24 to compute predictions
for boys for iteration M, grade level 1, year ≤ 5.

27. Continue. (End of first iteration DO loop, which varies
iterations from 1 to 100.)

28. If sex option is not used, GO TO 30. Otherwise, open DO
loop for calculating enrollment of girls for grade level
1, year ≤ 5 by subtracting male enrollment from total
enrollment to obtain female enrollment.

29. Continue. (End of DO loop for calculating female enrollment.)

30. Call subroutine OUTPUT to print probability tables for
enrollment predictions for grade level 1, year ≤ 5.

31. GO TO 70.

32. Open DO loop for varying iterations from 1 to 100 in the
prediction of first grade level enrollment for the remaining
years of the simulation (years in which births must be
predicted). This is the second iteration DO loop in the
program.

33. Open DO loop for calculating births, varying age group of
women from 1 to 6, by multiplying a randomly drawn birth
rate by a randomly drawn number of women, and dividing by
1000. If the sex option is used, calculate male births as
well as total births.

34. Accumulate the births across age groups.

35. Continue. (End of DO loop for calculating births.)

36. To obtain tentative prediction, subtract from total births
the product of (a) a randomly drawn preschool death rate
and (b) total births.

37. Add to tentative prediction a randomly drawn preschool net
migration figure.

38. Subtract from tentative prediction a randomly drawn figure
for nonpublic school enrollment.

39. To obtain final prediction, add to tentative prediction the
product of (a) a randomly drawn proportion retained in the
first grade level the previous year and (b) the enrollment
in the first grade level the previous year.

40. Store final prediction for iteration M, grade level 1,
year > 5.

41. If not using sex option, GO TO 43.

42. Proceed analogously to steps 36-40 to calculate predictions
for boys for the same iteration, grade, and year.

43. Continue. (End of the second iteration DO loop, which
varies iterations from 1 to 100.)

44. If sex option is not used, GO TO 46. Otherwise, open DO
loop for calculating enrollment for girls in grade level
1, year > 5 by subtracting male enrollment from total
enrollment to obtain female enrollment.

45. Continue. (End of DO loop for calculating female
enrollment.)

46. Call subroutine OUTPUT to print probability tables for
enrollment in grade level 1, year > 5.

47. GO TO 70.

48. IF (J .GT. 1) GO TO 67. (J is the year presently simulated.)

49. Open DO loop varying iterations from 1 to 100 in the
prediction of enrollment for a grade level above 1 for the
first year of the simulation. This is the third iteration
DO loop of the program.

50. As the tentative prediction, use the historical enrollment for
the previous year and grade.

51. Add to the tentative prediction a randomly drawn net
migration figure.

52. Add to tentative prediction a randomly drawn figure for net
transfers to public schools from nonpublic schools.

53. Subtract from tentative prediction the product of (a) the
previously drawn proportion of students retained in the
previous grade and year and (b) the enrollment of the
previous grade and year.

54. Add to tentative prediction the product of (a) a randomly
drawn proportion retained in the present grade the previous
year and (b) the enrollment in the present grade the previous
year.

55. To obtain the final prediction, subtract from the tentative 
prediction the product of (a) a randomly drawn proportion of 
students dropped from rolls because of death or institution- 
alization during the previous year and (b) the enrollment 
the previous year and grade. 

56. If the grade level is greater than 7, GO TO 58. 

57. GO TO 59. 

58. Adjust final prediction by subtracting the product of (a) 
a randomly drawn proportion of dropouts for the previous 
grade and year and (b) the enrollment in the previous grade 
and year. 

59. Store the final predictions for iteration M, grade level > 1, 
year 1. 

60. If not using the sex option, GO TO 62. 

61. Calculate the predictions for male enrollment by proceeding 
analogously to flow chart steps 50-59. 

62. Continue. (End of the third iteration DO loop, which varies 
iterations from 1 to 100.) 

63. If the sex option is not used, GO TO 65. Otherwise, open DO 
loop for calculating enrollment for girls in grade level >1, 
year 1 by subtracting male enrollment from total enrollment 
to obtain female enrollment. 

64. Continue. (End of DO loop for calculating female enrollment.) 

65. Call subroutine OUTPUT to print probability tables for enrollment
predictions for a grade level > 1, year 1.

66. GO TO 70. 

67. Open DO loop for varying iterations from 1 to 100 in the 
prediction of enrollment for a grade level above 1 for the 
remaining years of the simulation. This is the fourth 
iteration DO loop for the program. Obtain predictions by 
proceeding analogously to steps 50-61, using predicted 
enrollment, rather than historical enrollment, for the 
previous year and grade. 

68. Continue. (End of fourth iteration DO loop.) 

69. If using the sex option, calculate female enrollment. Then 
call subroutine OUTPUT. 

70. Continue. (End of the DO loop which varies grade level 
and was opened in step 16.) 

71. Store predictions across grades as "previous year" rather 
than "present year" predictions. 

72. Continue. (End of DO loop which varied year of simulation 
and was opened in step 15.) 

73. STOP. 
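
Steps 50 through 58 of the key amount to one arithmetic update per
iteration. The following Python sketch is illustrative only and is not
the program listed in the Appendix. The function name, the argument
names, and the dictionary keys are assumptions introduced here, and the
random draws are passed in rather than generated. The entry
"retention_lower" stands for the rate already drawn when the lower grade
was predicted, so that it is reused rather than drawn again.

```python
# Illustrative sketch only; all names are hypothetical.

def predict_upper_grade(prev_lower, prev_same, draws, grade_level):
    """One iteration of the prediction for a grade level above 1.

    prev_lower  -- enrollment in the previous grade, previous year
    prev_same   -- enrollment in the present grade, previous year
    draws       -- dict of values drawn for this iteration
    grade_level -- grade level being predicted
    """
    pred = prev_lower                                # step 50
    pred += draws["net_migration"]                   # step 51
    pred += draws["net_transfers"]                   # step 52
    # Step 53: students retained in the lower grade do not move up.
    pred -= draws["retention_lower"] * prev_lower
    # Step 54: students retained in the present grade stay in it.
    pred += draws["retention_same"] * prev_same
    # Step 55: losses through death or institutionalization.
    pred -= draws["death_rate"] * prev_lower
    # Steps 56-58: dropouts are subtracted only above grade level 7.
    if grade_level > 7:
        pred -= draws["dropout_rate"] * prev_lower
    return pred
```

In the third iteration DO loop the two enrollment arguments would be
historical figures; in the fourth they would be the values predicted for
the same iteration in the previous simulation year.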

Fig. 3.6. Flow chart for Subroutine OUTPUT.

Figure 3.6 (continued)



1. Open DO loop for putting predictions in ascending order. 

2. Continue. (End of ordering DO loop.) 

3. Compute the required percentiles. For example, the 5th 
percentile is the midpoint between the 5th and 6th 
prediction in the ordered list. 

4. Input the probability values associated with the percentiles. 
For example, the probability that the prediction will be 
less than or equal to the 5th percentile value is .05.

5. Print the percentiles and the associated probabilities. 

6. RETURN to main program. 

To complete the achievement of the first objective
of the study, simulations were performed using enrollment 
and prediction data from Brockton, Massachusetts. Three 
factors were varied in the simulations: symmetry of the
input, the random number chosen to initialize the subroutine
GAUSS, and the use of the option for the prediction of 
enrollments separately by sex. In descriptions of the 
simulations, the set of input data in which the high and 
low estimates for the variables are equidistant from the 
most likely estimates is called the "symmetrical" data; 
the one in which they are not equidistant is called the 
"skewed" data. Two integers were chosen to initialize GAUSS: 
the first was 420013363; the second, 083632429. A simulation 
was performed under each of the following conditions:

(1) symmetrical data, first random number, 
sex option not used, 

(2) skewed data, first random number, sex 
option not used, 

(3) symmetrical data, second random number, 
sex option not used, 

(4) symmetrical data, first random number, 
sex option used. 

The simulations were run on the IBM System/360 Model 40 Computer 
at Boston College. 

The second major objective was to investigate the 
predictive validity, reliability, and concurrent validity 
of the method with the use of the Brockton enrollment data
and predictions. The following four null hypotheses were
formulated to test predictive validity, reliability, and
concurrent validity, respectively, with the last two
hypotheses outlining the two tests of concurrent validity:

(1) There is no difference between the agreement of the
percentage of survival projections with the actual
Brockton enrollments and the agreement of the multivariable
predictions with the actual Brockton enrollments.

(2) There is no difference between the predicted 
enrollments produced by the two simulations using 
different initial values for the random number 
generator. 

(3) There is no difference between the enrollments 
predicted with the multivariable method and 
the .50 values produced by the simulation. 

(4) There is no difference between the .50 values 
produced by the simulation using symmetrical 
data and those produced using skewed data. 

The research hypothesis corresponding to the first null 
hypothesis is that the multivariable predictions agree more 
closely with the actual Brockton enrollment figures than do 
the percentage of survival projections; this hypothesis 
is designed to obtain data supporting the selection of the 
multivariable method as the prediction method on which the 
simulation was built. The second, third, and fourth
research hypotheses, like the corresponding null 
hypotheses, are stated in terms of no differences. They 
are based on the idea that probable difference would 
indicate undesirable noise factors in the simulation out- 
put. 

The predictive validity of the distributions of enroll- 
ments produced by the simulation could not be meaningfully 
tested since they were to a large extent determined by the 
high and low estimates which were chosen arbitrarily by the 
present investigator. The assignment of high and low 
estimates was necessary because data for these estimates were 
not available. Thus the test of predictive validity is 
actually a test of the predictive validity of the prediction 
model on which the simulation is based; it is a comparison of 
the predictive accuracy of the multivariable model with that 
of the percentage of survival model, the latter being the 
more frequently used method. The percentage of survival
projections were computed by the present investigator from
enrollment figures (Brockton School Department, 1959-1969).
Six years of historical data were used to obtain the five
survival percentages which were then averaged. The number of
survival percentages to be averaged was chosen after an
examination of the literature; averages of three to ten
numbers are typical, and five is probably the most common
(Hunt, 1967; Metcalf & Eddy, Inc., 1968).

The multivariable prediction figures were obtained
by using the data from the Center for Field Studies
(1964; Appendix). Since it was necessary to make some
modifications in the data for use in the simulation, the 
predictions calculated by the Center were not adequate 
for the test; the predictions were recalculated using the 
modified data. 

To compare the accuracy of the two methods, the abso- 
lute differences between enrollments predicted by the 
multivariable method and actual enrollments in Brockton 
were compared to the absolute differences between enroll- 
ments predicted by the percentage of survival method and 
actual Brockton enrollments. The Wilcoxon matched-pairs 
signed-ranks test was chosen for this comparison. The 
comparison is one of two related samples of interval data. 
The samples were related since the predictions could be 
paired by grade and year. The level of measurement was 
interval since the data consisted of numbers of students.
There was little evidence to indicate that the data would
satisfy parametric assumptions. Two nonparametric tests 
listed in Siegel (1956: inside covers) for data with these 
characteristics are the Walsh test and the randomization test 
for matched pairs. The Walsh test requires an assumption which 
does not necessarily hold for the present data. The assumption 
is that the differences between the matched pairs are drawn 
from symmetrical populations (Siegel, 1956:83). The randomi- 
zation test is not based on such an assumption, but because 
of computational cumbersomeness, its use is recommended only 
for very small samples; the Wilcoxon matched-pairs signed- 
ranks test is suggested as an efficient alternative (Siegel, 
1956:91). 

Brockton enrollment figures were available for each 
of 12 grades for the six years 1964-1969. Comparisons were 
made once with an N of 72 and then separately by grade and by 
year with N's of six and twelve. An N of twelve, and 
certainly an N of 72, is large enough to make the computa- 
tions cumbersome; thus the Wilcoxon test was chosen for these 
comparisons. It was also chosen for the comparisons with an 
N of six so that the tests would be consistent and comparable.

The Wilcoxon matched-pairs signed-ranks test is actually 
a randomization test on the ranks requiring an ordered metric 
scale in which the differences between pairs can be ranked in 
order of absolute size (Siegel, 1956:91, 75-76). A two-tailed 
test was performed to avoid excluding the possibility that 
the percentage of survival predictions were more accurate. 
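
The computation behind the Wilcoxon matched-pairs signed-ranks test can
be sketched as follows. This is a minimal Python illustration of the
standard procedure (zero differences dropped, tied absolute differences
given average ranks, T taken as the smaller rank sum, and the usual
large-sample z approximation); it is not the computation actually used
in the study, and the function name is hypothetical. In the present
application the two inputs would be the absolute prediction errors of
the two methods, paired by grade and year.

```python
# Illustrative sketch only; the function name is hypothetical.
import math


def wilcoxon_signed_ranks(x, y):
    """Return (T, N, z) for paired samples x and y."""
    # Differences of zero are dropped and do not count toward N.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t = min(positive, negative)
    if n == 0:
        return t, n, float("nan")
    # Large-sample normal approximation to the distribution of T.
    mean_t = n * (n + 1) / 4.0
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return t, n, (t - mean_t) / sd_t
```

For small N the value of T would be referred to a table of critical
values; the z returned here is meaningful only for larger samples.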

The reliability was tested by comparing the two simu- 
lations whose input and options differed only in the seeds, 
the random numbers used to initialize GAUSS. Reliability 
is used in this study to refer to the relationship between 
the output of two runs of the simulation using two different 
seeds; this use of the term reliability is not to be confused 
with its use in some prediction studies to mean the accuracy 
of prediction. The reliability was tested by comparing the 
one hundred iterations produced for each grade and year by 
one simulation to those produced by the other simulation.
These comparisons were made by the Kolmogorov-Smirnov two-
sample test (Siegel, 1956:127-136). Each run of the simulation
produces one hundred iterations per grade and year. With
twelve grades and twelve simulation years, the data
consisted of 144 pairs of independent samples with an N
of one hundred. A significance test was performed for
each pair. The Kolmogorov-Smirnov test was chosen, although
it assumes only ordinal data, because it is sensitive to all 
kinds of differences in the distributions from which the 
two samples are drawn; this is important since the simulation 
results depend on the distribution as a whole rather than just 
the central tendency. The Wald-Wolfowitz runs test also has 
this characteristic, but it probably has less power-efficiency 
than does the Kolmogorov-Smirnov test (Siegel, 1956:144-145). 

A two-tailed Kolmogorov-Smirnov test was performed using 
KOLM2, a program from the IBM System/360 Scientific Subroutine 
Package (1968:65-66). 
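
The statistic on which the two-sample Kolmogorov-Smirnov test rests is
the largest absolute difference between the two cumulative step
functions. The Python sketch below illustrates that statistic only; it
does not reproduce KOLM2, which also reports a probability level, and
the function name is hypothetical.

```python
# Illustrative sketch only; this is not a rendering of KOLM2.

def ks_two_sample_d(sample_a, sample_b):
    """Return D, the maximum absolute difference between the
    cumulative step functions of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        value = min(a[i], b[j])
        # Advance both samples past every observation equal to value.
        while i < na and a[i] == value:
            i += 1
        while j < nb and b[j] == value:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d
```

As a rough guide, and assuming the usual large-sample two-tailed
critical value of 1.36 times the square root of (n1 + n2)/(n1 n2) at the
.05 level, two samples of one hundred iterations each would differ
significantly when D exceeds approximately .19.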

Concurrent validity was investigated by testing null 
hypotheses three and four. The purpose of the third hypothesis 
was to detect random or systematic errors in the .50 probability 
level predictions resulting from the use of different pseudo- 
random numbers; the purpose of the fourth hypothesis was to detect 
error in the .50 probability predictions resulting from con- 
verting the distributions of "skewed" estimates from beta 
distributions to normal distributions. The Wilcoxon matched-
pairs signed-ranks test was chosen for this comparison. Like
the data for the test of predictive validity, the data con- 
sisted of two related samples of interval data; the samples 
are related since the predictions are paired by grade and 
year of prediction. Other characteristics of the data 
relevant to choosing the test were like those of the predictive 
validity data. A Wilcoxon test was performed with an N of 
144 (twelve grades and twelve years of prediction) using the 
z approximation, and tests were also performed separately by 
year using the T statistic. The test of hypothesis three was 
two-tailed, but the test of hypothesis four was one-tailed.
It was possible to predict a direction of significance for
hypothesis four because the input data were skewed in the 
direction of greater numbers of students. 

Chapter IV



Results of the Study 

The first objective of the study involved the 
collection of data, adaption of the multivariable model, 
writing the computer programs and instructions for the 
user, and performing the simulations. Discussed in 
Chapter III are the outcomes of all of these activities 
except the performance of the simulation. In Tables 4.1
(p. 98) and 4.2 (pp. 99-101) are samples of the simulation
output. Table 4.1 contains a sample of the output for the
simulation using symmetrical data, the first random number 
starter, and no sex option. Table 4.2 contains a portion 
of the output for the simulation using symmetrical data, 
the first random number starter, and the sex option. 

With twelve grades and twelve years of simulation, 
running the simulation without the sex option required 
88,736 bytes in storage and 46 minutes and 13 seconds of 
compilation and execution time. The time and storage 
requirements placed the program charges at $75 per hour 
at the Boston College Computer Center; thus, the cost of



TABLE 4.1

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 7 IN 1971
WILL BE LESS THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               1512.
    .10               1525.
    .20               1546.
    .30               1559.
    .40               1576.
    .50               1586.
    .60               1600.
    .70               1614.
    .80               1629.
    .90               1658.
    .95               1677.

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 7 IN 1971
WILL BE GREATER THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               1677.
    .10               1658.
    .20               1629.
    .30               1614.
    .40               1600.
    .50               1586.
    .60               1576.
    .70               1559.
    .80               1546.
    .90               1525.
    .95               1512.

TABLE 4.2

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 2 IN 1975
WILL BE LESS THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               1899.
    .10               1991.
    .20               2042.
    .30               2103.
    .40               2177.
    .50               2234.
    .60               2277.
    .70               2351.
    .80               2384.
    .90               2464.
    .95               2561.

PROBABILITY THAT TOTAL ENROLLMENT IN GRADE 2 IN 1975
WILL BE GREATER THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               2561.
    .10               2464.
    .20               2384.
    .30               2351.
    .40               2277.
    .50               2234.
    .60               2177.
    .70               2103.
    .80               2042.
    .90               1991.
    .95               1899.

TABLE 4.2 (continued)

PROBABILITY THAT MALE ENROLLMENT IN GRADE 2 IN 1975
WILL BE LESS THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05                989.
    .10               1031.
    .20               1068.
    .30               1099.
    .40               1138.
    .50               1165.
    .60               1192.
    .70               1233.
    .80               1255.
    .90               1304.
    .95               1342.

PROBABILITY THAT MALE ENROLLMENT IN GRADE 2 IN 1975
WILL BE GREATER THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               1342.
    .10               1304.
    .20               1255.
    .30               1233.
    .40               1192.
    .50               1165.
    .60               1138.
    .70               1099.
    .80               1068.
    .90               1031.
    .95                989.

TABLE 4.2 (continued)

PROBABILITY THAT FEMALE ENROLLMENT IN GRADE 2 IN 1975
WILL BE LESS THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05                912.
    .10                946.
    .20                978.
    .30                999.
    .40               1035.
    .50               1067.
    .60               1095.
    .70               1112.
    .80               1137.
    .90               1174.
    .95               1204.

PROBABILITY THAT FEMALE ENROLLMENT IN GRADE 2 IN 1975
WILL BE GREATER THAN THE SPECIFIED PREDICTED ENROLLMENT

PROBABILITY     PREDICTED ENROLLMENT

    .05               1204.
    .10               1174.
    .20               1137.
    .30               1112.
    .40               1095.
    .50               1067.
    .60               1035.
    .70                999.
    .80                978.
    .90                946.
    .95                912.

simulation without the sex option was $57.75. Simulation
using the sex option required 89,224 bytes in storage and 
one hour 18 minutes and 54 seconds, costing $98.62. 

The second major objective was the investigation of 
predictive validity, reliability, and concurrent validity. 
Statistical tests were performed for each of the four 
hypotheses previously stated (p. 90). The first involved 
a comparison of the predictive validity of the multivariable 
and percentage of survival methods; the second, an investigation 
of reliability by a comparison of the output of two different 
simulations. The third and fourth were set up to test con- 
current validity by comparisons of the .50 values produced 
by the simulation with figures produced by the multivariable 
method without simulation and with figures produced by the 
simulation using skewed data. 

The testing of the hypotheses resulted in statistically 
significant differences for all four. Although the test for 
predictive validity was a two-tailed test, the significant 
difference was in the direction of better predictive accuracy 
for the multivariable method, the prediction method on which the 
simulation is based. The finding of significant differences 
for the other hypotheses, however, requires additional interpretation;
since it indicates the presence of "errors," it
is necessary to examine their bearing on the usefulness 
and validity of the simulation procedure. 

The investigation of predictive validity consisted 
of Wilcoxon signed-ranks matched-pairs tests with the 
absolute differences between the actual Brockton enroll- 
ments and the multivariable predictions being compared with 
the absolute differences between the actual Brockton enroll- 
ments and the percentage of survival projections. Computation 
instructions and significance tables were found in Siegel 
(1956:75-83, 254). Siegel recommended that a z approximation 
to the T, the usual statistic for the Wilcoxon, be used for 
sample sizes over 25. The level of significance chosen for 
rejection of the null hypothesis was .05. Using a two-tailed 
test and an N of 72, combining the 12 grades and six years 
of prediction, the value of z was -2.29, which is significant 
at the .022 probability level, indicating better predictive 
accuracy for the multivariable method. Analyzing the data 
separately by year of prediction, there were six significance 
tests with an N of 12. Although the values of T for five of 
six tests were in the direction of better predictive
validity for the multivariable method, none of the T 
values were significant. When the data were analyzed 
separately by grade level with N's of six, the values of 
T for grades 2, 3, 4, and 5 were significant at or beyond 
the .05 level in the direction of better predictive validity 
for the multivariable method. The values of T for the other 
grades were not significant; five were in the direction of
better predictive validity for the multivariable method; grades
7, 8, and 12 were in the opposite direction. Since the test
using an N of 72 was significant at the .022 level, the
hypothesis of no difference in predictive validity was
rejected.

In summary, the significance of the N of 72 can be 
viewed as composed of the results analyzed separately by 
year, which are for the most part in the right direction, 
but which lack a large enough N for significance. It can 
also be viewed as a summary of the significance tests 
computed separately by grade. Further studies are needed 
to interpret the patterns of significance, lack of signifi- 
cance, and direction of the differences in the significance 
tests for the separate years and grades. For example,
as more years of enrollment figures are available for 
Brockton, the tests computed separately by grade can be 
conducted with a larger N. Examination of the determinants 
of the Brockton enrollments might show why in some instances 
the percentage of survival method was a better predictor 
than the multivariable method. 

The significance test used for testing the reliability 
was the Kolmogorov-Smirnov two-sample test, which measures 
agreement between two cumulative distributions (Siegel, 
1956:127-136). In the present case, the two distributions 
were the 100 predictions produced by the simulation for a 
given grade and year using one random number starter and the 
100 predictions produced for the same grade and year using 
a different random number starter. Since the simulation
produced output for 12 grades and 12 years, 144 tests of 
significance were performed. The level of significance chosen 
for rejection of the null hypothesis was .05. Only 7 of the 
144 failed to reach significance at the .05 level or beyond. 
Thus the null hypothesis of no difference in outputs of the 
simulation was rejected. The computer program printed the 
significance levels correct to five decimal places; 
significance levels for 109 of the 144 tests were .00000.

The significance test used for the third and fourth 
hypotheses was the Wilcoxon matched-pairs signed-ranks
test. Since the Kolmogorov-Smirnov test showed that the 
simulation outputs using different pseudo-random numbers 
were significantly different, the Wilcoxon tests for the 
third hypothesis were performed separately on the outputs 
of the simulations differing in the pseudo-random numbers 
used. Two-tailed tests were performed; as in the tests of all 
four hypotheses, the .05 level of significance was chosen as 
the rejection level. The T statistic was used for the tests 
performed separately by year; the z approximation was used 
for the overall tests of significance. The tests with the 
data from the simulation using the first random number 
showed a z of -3.832, which is significant beyond the .05 
level; however, none of the T's were significant at .05 
although all were in the direction of higher predictive 
figures for the simulation data. The output using the second
random number produced a z of -2.7575, also significant beyond
the .05 level. Only one year, 1966, showed significant
differences at the .05 level; for all the years except 1972
and 1973 the differences were in the direction of higher
predictive figures for simulation data. Since the overall
significance tests for both sets of data produced significant 
z's, the third null hypothesis was rejected, indicating a 
lack of perfect correspondence between the 0.50 simulation 
figures and the nonsimulation figures. 
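The computation of T and of its z approximation can be sketched as follows. This is a minimal Python illustration, not the program used in the study. The differences are the A-B column of Table 4.5 for 10/1/64; the function name is hypothetical, zero differences are dropped, tied absolute differences share the mean rank, and no tie correction is applied to the standard deviation. The z value for a single year of twelve pairs is shown only to make the arithmetic concrete, since the study applied z to the overall tests.

```python
import math

def wilcoxon_signed_ranks(differences):
    """Return (T, z, n) for the Wilcoxon matched-pairs signed-ranks test."""
    nonzero = [d for d in differences if d != 0]   # zero differences are dropped
    n = len(nonzero)
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:                                   # tied |d| values share the mean rank
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[k] = mean_rank
        i = j + 1
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    t = min(positive, negative)                    # T is the smaller rank sum
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t, (t - mean_t) / sd_t, n

# Nonsimulation minus 0.50 simulation figures, grades 1-12, 10/1/64 (Table 4.5, A-B).
diffs_1964 = [-6, 1, 8, -8, -1, 2, -2, 1, 8, -6, -1, -3]
t, z, n = wilcoxon_signed_ranks(diffs_1964)
print(t, round(z, 2), n)
```
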

The Wilcoxon tests for the fourth null hypothesis,
testing the effect of skewing the input, were one-tailed 
tests, since the directional effect of the skewing could 
be predicted. As in the tests of the third null hypothesis, 
the T statistic was used for tests performed separately by 
year; the z approximation was used for overall tests of 
significance. The tests produced a z of -10.41; the z and 
all of the T's were significant beyond the 0.05 level.
These results indicated rejection of the fourth null hypothesis.

Interpretation of the significance tests for the null 
hypotheses is aided by considering their impact on the 
simulation output in terms of numbers and percentages of 
students. Tables 4.3, 4.4, and 4.5 (pp. 109-121) contain
the data used in the tests of predictive validity, reliability,
and concurrent validity, respectively. Table 4.6
(p. 122) contains a summary of the mean algebraic and
arithmetic differences between columns of enrollment
figures on the three preceding tables and includes the
corresponding percentages. Table 4.6 is useful for comparing
the impact on the simulation results of the significant
differences found by the tests of significance.
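The two kinds of means reported in the tables can be illustrated with a short Python sketch. It assumes, consistent with the figures printed in the tables, that the "arithmetic mean" is the mean of the absolute differences and the "algebraic mean" is the mean of the signed differences. The function name is hypothetical; the data are the two seed columns of Table 4.4 for 10/1/64.

```python
def arithmetic_and_algebraic_means(column_x, column_y):
    """Mean absolute difference and mean signed difference of x - y."""
    diffs = [x - y for x, y in zip(column_x, column_y)]
    arithmetic = sum(abs(d) for d in diffs) / len(diffs)
    algebraic = sum(diffs) / len(diffs)
    return arithmetic, algebraic

# 0.50 probability figures for grades 1-12, 10/1/64 (Table 4.4).
seed_1 = [1668, 1520, 1415, 1332, 1247, 1190, 1172, 1174, 1076, 933, 1082, 993]
seed_2 = [1670, 1519, 1419, 1332, 1245, 1190, 1170, 1174, 1087, 929, 1083, 992]

arithmetic, algebraic = arithmetic_and_algebraic_means(seed_1, seed_2)
print(round(arithmetic, 2), round(algebraic, 2))   # 2.33 -0.67
```

The arithmetic mean shows how far apart two columns are on the average; the algebraic mean shows whether one column runs consistently higher than the other.
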

The percentage of survival and multivariable models
produced significantly different predictions. However,
the extent of the differences can be examined in other 
ways. One measure of prediction accuracy was defined 
by Greenawalt and Mitchell: "It was assumed that a forecast,
having run seven years, which predicted enrollment within
plus or minus 10 per cent of the actual enrollment was
'accurate' [1966:8]." Using the ten percent standard,
nine of the 72 predictions using the multivariable method
were "inaccurate"; ten of the 72 percentage of survival
projections were "inaccurate." The figures which were not
within ten percent of the actual enrollments are marked by 
asterisks in Table 4.3. The scoring procedure used here was 
somewhat different from that of Greenawalt and Mitchell, since
the prediction period was six, rather than seven, years and
accuracy scores were calculated for all grades and years

TABLE 4.3

The Comparison of the Actual Brockton Enrollments
With Those Predicted by the Multivariable Technique
and Those Projected by the Percentage of Survival Technique

A = Actual Enrollments in Brockton
    Public Schools

B = Enrollments Predicted by the
    Multivariable Technique Without
    Simulation

C = Enrollments Projected by the
    Percentage of Survival Technique







Date 10/1/64

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1639   1662   1649     -23     -10         +13
  2     1488   1521   1531     -33     -43         -10
  3     1391   1423   1434     -32     -43         -11
  4     1306   1324   1329     -18     -23          -5
  5     1240   1246   1278      -6     -38         -32
  6     1157   1192   1211     -35     -54         -19
  7     1131   1170   1214     -39     -83         -44
  8     1176   1175   1172      +1      +4          -3
  9     1064   1084   1064     -20       0         +20
 10     1055    927    992    +128*    +63         +65
 11     1052   1081   1042     -29     +10         +19
 12      931    990    924     -59      +7         +52

Arithmetic mean =            35.25   31.50       24.42
Algebraic mean  =           -13.75  -17.50        3.75








Date 10/1/65

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1659   1730   1757     -71     -98         -27
  2     1580   1559   1555     +21     +25          -4
  3     1475   1519   1548     -44     -73         -29
  4     1375   1408   1421     -33     -46         -13
  5     1300   1311   1351     -11     -51         -40
  6     1233   1235   1288      -2     -55         -53
  7     1195   1207   1275     -12     -80         -68
  8     1260   1172   1211     +88     +49         +39
  9     1130   1145   1131     -15      -1         +14
 10     1055    996   1050     +59      +5         +54
 11      981    888    915     +93     +66         +27
 12      882   1022    922    -140*    -40        +100

Arithmetic mean =            49.08   49.08       39.00
Algebraic mean  =            -5.58  -24.92        0.00







TABLE 4.3 (continued)

Date 10/1/66

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1672   1836   1871    -164    -199*        -35
  2     1596   1622   1657     -26     -61         -35
  3     1564   1560   1572      +4      -8          -4
  4     1452   1503   1534     -51     -82         -31
  5     1367   1395   1445     -28     -78         -50
  6     1332   1300   1362     +32     -30          +2
  7     1276   1251   1356     +25     -80         -55
  8     1208   1208   1272       0     -64         -64
  9     1097   1173   1168     -76     -71          +5
 10     1127    952   1116    +175*    +11        +164
 11     1019    924    969     +95     +50         +45
 12      829    817    810     +12     +19          -7

Arithmetic mean =            57.33   62.75       41.42
Algebraic mean  =            -0.17  -49.42       -5.42






Date 10/1/67

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1804   1976   2009    -172    -205*        -33
  2     1612   1719   1764    -107    -152         -45
  3     1567   1622   1675     -55    -108         -53
  4     1468   1545   1558     -77     -90         -13
  5     1452   1488   1560     -36    -108         -72
  6     1435   1382   1457     +53     -22         +31
  7     1398   1316   1434     +82     -36         +46
  8     1312   1252   1353     +60     -41         +19
  9     1095   1210   1227    -115*   -132*        -17
 10     1013   1002   1153     +11    -140*       -129
 11     1013    939   1029     +74     -16         +58
 12      866    889    857     -23      +9         +14

Arithmetic mean =            72.08   88.25       44.17
Algebraic mean  =           -25.41  -86.75      -16.17






Date 10/1/68

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1718   1990   2009    -272*   -291*        -19
  2     1725   1850   1894    -125    -169         -44
  3     1643   1718   1783     -75    -140         -65
  4     1563   1606   1660     -43     -97         -54
  5     1519   1531   1584     -12     -65         -53
  6     1460   1475   1573     -15    -113         -98
  7     1590   1399   1534    +191*    +56        +135
  8     1404   1316   1431     +88     -27         +61
  9     1158   1249   1305     -91    -147*        -56
 10     1041   1045   1211      -4    -170*       -166
 11      935    992   1064     -57    -129*        -72
 12      928    907    910     +21     +18          +3

Arithmetic mean =            82.33  118.50       68.83
Algebraic mean  =           -32.83 -106.17      -36.67

TABLE 4.3 (continued)

Date 10/1/69

Grade      A      B      C     A-B     A-C    |A-B| - |A-C|
  1     1828   2049   2069    -221*   -241*        -20
  2     1731   1867   1894    -136    -163         -27
  3     1784   1847   1915     -63    -131         -68
  4     1620   1699   1767     -79    -147         -68
  5     1596   1590   1688      +6     -92         -86
  6     1543   1517   1597     +26     -54         -28
  7     1703   1492   1656    +211*    +47        +164
  8     1570   1399   1530    +171*    +40        +131
  9     1329   1324   1380      +5     -51         -46
 10     1147   1056   1288     +91    -141*        -50
 11     1033   1033   1117       0     -84         -84
 12      919    963    941     -44     -22         +22

Arithmetic mean =            87.75  101.08       71.17
Algebraic mean  =            -2.75  -86.58       -8.33

TABLE 4.4

The 0.50 Probability Figures Calculated by Simulation
With Symmetrical Data Using Different Seeds for
Random Number Generator



Date 10/1/64

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          1668             1670                     -2
  2          1520             1519                     +1
  3          1415             1419                     -4
  4          1332             1332                      0
  5          1247             1245                     +2
  6          1190             1190                      0
  7          1172             1170                     +2
  8          1174             1174                      0
  9          1076             1087                    -11
 10           933              929                     +4
 11          1082             1083                     -1
 12           993              992                     +1

Arithmetic mean =                                     2.33
Algebraic mean  =                                    -0.67




Date 10/1/65

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          1728             1744                    -16
  2          1562             1565                     -3
  3          1519             1520                     -1
  4          1403             1400                     +3
  5          1319             1318                     +1
  6          1235             1234                     +1
  7          1204             1206                     -2
  8          1174             1173                     +1
  9          1149             1142                     +7
 10           986             1012                    -26
 11           894              887                     +7
 12          1027             1025                     +2

Arithmetic mean =                                     5.83
Algebraic mean  =                                    -2.17

TABLE 4.4 (continued)

Date 10/1/66

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          1845             1839                     +6
  2          1620             1632                    -12
  3          1561             1566                     -5
  4          1503             1501                     +2
  5          1391             1392                     -1
  6          1309             1306                     +3
  7          1251             1251                      0
  8          1206             1206                      0
  9          1178             1179                     -1
 10           952              961                     -9
 11           921              939                    -18
 12           822              815                     +7

Arithmetic mean =                                     5.33
Algebraic mean  =                                    -2.33






Date 10/1/67

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          1997             1983                    +14
  2          1726             1724                     +2
  3          1615             1631                    -16
  4          1556             1552                     +4
  5          1490             1488                     +2
  6          1376             1378                     -2
  7          1321             1323                     -2
  8          1251             1250                     +1
  9          1192             1199                     -7
 10          1011             1009                     +2
 11           954              957                     -3
 12           882              904                    -22

Arithmetic mean =                                     6.42
Algebraic mean  =                                    -2.25



Date 10/1/68

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          1993             1994                     -1
  2          1865             1856                     +9
  3          1718             1727                     -9
  4          1595             1617                    -22
  5          1542             1550                     -8
  6          1478             1476                     +2
  7          1396             1398                     -2
  8          1323             1320                     +3
  9          1258             1254                     +4
 10          1041             1030                    +11
 11          1006              991                    +15
 12           914              925                    -11

Arithmetic mean =                                     8.08
Algebraic mean  =                                    -0.75

TABLE 4.4 (continued)

Date 10/1/69

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2061             2049                    +12
  2          1869             1874                     -5
  3          1863             1840                    +23
  4          1705             1712                     -7
  5          1583             1605                    -22
  6          1527             1530                     -3
  7          1491             1494                     -3
  8          1397             1396                     +1
  9          1318             1325                     -7
 10          1092             1093                     -1
 11          1020             1023                     -3
 12           967              954                    +13

Arithmetic mean =                                     8.33
Algebraic mean  =                                    -0.17

Date 10/1/70

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2099             2033                    +66
  2          1923             1922                     +1
  3          1865             1873                     -8
  4          1843             1814                    +29
  5          1685             1701                    -16
  6          1566             1589                    -23
  7          1545             1555                    -10
  8          1491             1491                      0
  9          1471             1455                    +16
 10          1139             1153                    -14
 11          1072             1086                    -14
 12           982              990                     -8

Arithmetic mean =                                    17.08
Algebraic mean  =                                     1.58

Date 10/1/71

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2139             2149                    -10
  2          1988             1920                    +68
  3          1920             1924                     -4
  4          1844             1856                    -12
  5          1824             1802                    +22
  6          1669             1684                    -15
  7          1586             1607                    -21
  8          1544             1551                     -7
  9          1587             1576                    +11
 10          1290             1281                     +9
 11          1122             1144                    -22
 12          1038             1050                    -12

Arithmetic mean =                                    17.75
Algebraic mean  =                                     0.58

Date 10/1/72

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2228             2207                    +21
  2          1997             2011                    -14
  3          1974             1925                    +49
  4          1915             1898                    +17
  5          1830             1843                    -13
  6          1807             1788                    +19
  7          1688             1702                    -14
  8          1583             1610                    -27
  9          1645             1653                     -8
 10          1400             1390                    +10
 11          1273             1265                     +8
 12          1083             1095                    -12

Arithmetic mean =                                    17.67
Algebraic mean  =                                     3.00




Date 10/1/73

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2286             2287                     -1
  2          2083             2051                    +32
  3          1997             2000                     -3
  4          1956             1908                    +48
  5          1907             1888                    +19
  6          1818             1824                     -6
  7          1829             1809                    +20
  8          1686             1696                    -10
  9          1678             1707                    -29
 10          1447             1459                    -12
 11          1379             1367                    +12
 12          1224             1224                      0

Arithmetic mean =                                    16.00
Algebraic mean  =                                     5.83




Date 10/1/74

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2384             2393                     -9
  2          2149             2144                     +5
  3          2087             2052                    +35
  4          1970             1976                     -6
  5          1937             1891                    +46
  6          1894             1872                    +22
  7          1840             1849                     -9
  8          1827             1812                    +15
  9          1791             1794                     -3
 10          1490             1521                    -31
 11          1428             1433                     -5
 12          1321             1317                     +4

Arithmetic mean =                                    15.83
Algebraic mean  =                                     5.33

TABLE 4.4 (continued)

Date 10/1/75

Grade   Random Seed #1   Random Seed #2   Random Seed #1 - Random Seed #2
  1          2457             2412                    +45
  2          2234             2230                     +4
  3          2151             2147                     +4
  4          2070             2038                    +32
  5          1949             1950                     -1
  6          1917             1883                    +34
  7          1917             1897                    +20
  8          1842             1850                     -8
  9          1902             1908                     -6
 10          1580             1593                    -13
 11          1465             1494                    -29
 12          1373             1376                     -3

Arithmetic mean =                                    16.58
Algebraic mean  =                                     6.58

TABLE 4.5

The Comparison of the Non-simulation Multivariable
Predictions with the 0.50 Probability Figures for
the Symmetrical and the Skewed Data

A = Enrollments Predicted by Multivariable
    Technique Without Simulation

B = 0.50 Probability Level Prediction
    Calculated by Simulation Using
    Symmetrical Data

C = 0.50 Probability Level Prediction
    Calculated by Simulation Using
    Skewed Data



Date 10/1/64

Grade      A      B      C     A-B     B-C
  1     1662   1668   1695      -6     -27
  2     1521   1520   1521      +1      -1
  3     1423   1415   1417      +8      -2
  4     1324   1332   1334      -8      -2
  5     1246   1247   1249      -1      -2
  6     1192   1190   1191      +2      -1
  7     1170   1172   1174      -2      -2
  8     1175   1174   1175      +1      -1
  9     1084   1076   1091      +8     -15
 10      927    933    947      -6     -14
 11     1081   1082   1095      -1     -13
 12      990    993   1001      -3      -8

Arithmetic mean =             3.92    7.33
Algebraic mean  =            -0.58   -7.33






Date 10/1/65

Grade      A      B      C     A-B     B-C
  1     1730   1728   1757      +2     -29
  2     1559   1562   1592      -3     -30
  3     1519   1519   1521       0      -2
  4     1408   1403   1406      +5      -3
  5     1311   1319   1323      -8      -4
  6     1235   1235   1240       0      -5
  7     1207   1204   1206      +3      -2
  8     1172   1174   1177      -2      -3
  9     1145   1149   1165      -4     -16
 10      996    986   1009     +10     -25
 11      888    894    919      -6     -25
 12     1022   1027   1048      -5     -21

Arithmetic mean =             3.33   13.75
Algebraic mean  =             0     -13.75

TABLE 4.5 (continued)

Date 10/1/66

Grade      A      B      C     A-B     B-C
  1     1836   1845   1879      -9     -34
  2     1622   1620   1647      +2     -27
  3     1560   1561   1590      -1     -29
  4     1503   1503   1509       0      -6
  5     1395   1391   1394      +4      -3
  6     1300   1309   1314      -9      -5
  7     1251   1251   1257       0      -6
  8     1208   1206   1210      +2      -4
  9     1173   1178   1199      -5     -21
 10      952    952    983       0     -31
 11      924    921    958      +3     -37
 12      817    822    853      -5     -31

Arithmetic mean =             3.33   19.50
Algebraic mean  =            -1.5   -19.50

Date 10/1/67

Grade      A      B      C     A-B     B-C
  1     1976   1997   2032     -21     -35
  2     1719   1726   1757      -7     -31
  3     1622   1615   1644      +7     -29
  4     1545   1556   1584     -11     -28
  5     1488   1490   1494      -2      -4
  6     1382   1376   1382      +6      -6
  7     1316   1321   1328      -5      -7
  8     1252   1251   1258      +1      -7
  9     1210   1192   1209     +18     -17
 10     1002   1011   1045      -9     -34
 11      939    954    999     -15     -45
 12      889    882    923      +7     -41

Arithmetic mean =             9.08   23.67
Algebraic mean  =            -2.58  -23.67

Date 10/1/68

Grade      A      B      C     A-B     B-C
  1     1990   1993   2026      -3     -33
  2     1850   1856   1902      -6     -46
  3     1718   1718   1756       0     -38
  4     1606   1595   1625     +11     -30
  5     1531   1542   1575     -11     -33
  6     1475   1478   1483      -3      -5
  7     1399   1396   1403      +3      -7
  8     1316   1323   1331      -7      -8
  9     1249   1258   1285      -9     -27
 10     1045   1041   1073      +4     -32
 11      992   1006   1053     -14     -47
 12      907    914    963      -7     -49

Arithmetic mean =             6.50   32.08
Algebraic mean  =            -3.50  -32.08

TABLE 4.5 (continued)

Date 10/1/69

Grade      A      B      C     A-B     B-C
  1     2049   2061   2285     -12    -224
  2     1867   1869   1898      -2     -29
  3     1847   1863   1897     -16     -34
  4     1699   1705   1739      -6     -34
  5     1590   1583   1616      -7     -33
  6     1517   1527   1561     -10     -34
  7     1492   1491   1499      +1      -8
  8     1399   1397   1404      +2      -7
  9     1324   1318   1344      +6     -26
 10     1056   1092   1130     -36     -38
 11     1033   1020   1058     +13     -38
 12      963    967   1018      -4     -51

Arithmetic mean =             9.58   46.33
Algebraic mean  =            -5.92  -46.33






Date 10/1/70

Grade      A      B      C     A-B     B-C
  1     2090   2099   2338      -9    -239
  2     1922   1923   2119      -1    -196
  3     1869   1865   1895      +4     -30
  4     1827   1843   1880     -16     -37
  5     1683   1685   1715      -2     -30
  6     1576   1566   1601     +10     -35
  7     1537   1545   1581      -8     -36
  8     1491   1491   1502       0     -11
  9     1471   1471   1504       0     -33
 10     1151   1139   1174     +12     -35
 11     1045   1072   1119     -27     -47
 12      997    982   1023     +15     -41

Arithmetic mean =             8.67   64.17
Algebraic mean  =            -1.83  -64.17

Date 10/1/71

Grade      A      B      C     A-B     B-C
  1     2151   2139   2380     +13    -241
  2     1960   1988   2218     -28    -230
  3     1922   1920   2111      +2    -191
  4     1851   1844   1875      +7     -31
  5     1808   1824   1861     -16     -37
  6     1667   1669   1702      -2     -33
  7     1595   1586   1621      +9     -35
  8     1538   1544   1578      -6     -34
  9     1585   1587   1617      -2     -30
 10     1288   1290   1330      -2     -40
 11     1136   1122   1163     +14     -41
 12     1009   1038   1088     -29     -50

Arithmetic mean =            10.83   82.75
Algebraic mean  =            -3.33  -82.75

TABLE 4.5 (continued)

Date 10/1/72

Grade      A      B      C     A-B     B-C
  1     2220   2228   2460      -8    -232
  2     2018   1997   2224     +21    -227
  3     1961   1974   2203     -13    -229
  4     1904   1915   2099     -11    -184
  5     1835   1830   1865      +5     -35
  6     1792   1807   1847     -15     -40
  7     1687   1688   1726      -1     -38
  8     1595   1583   1620     +12     -37
  9     1640   1645   1696      -5     -51
 10     1393   1400   1432      -7     -32
 11     1268   1273   1325      -5     -52
 12     1096   1083   1129     +13     -46

Arithmetic mean =             9.67  100.25
Algebraic mean  =            -1.17 -100.25






Date 10/1/73

Grade      A      B      C     A-B     B-C
  1     2289   2286   2532      +3    -246
  2     2081   2083   2308      -2    -225
  3     2018   1997   2226     +21    -229
  4     1942   1956   2185     -14    -229
  5     1886   1907   2094     -21    -187
  6     1818   1818   1852       0     -34
  7     1813   1829   1873     -16     -44
  8     1686   1686   1724       0     -38
  9     1695   1678   1736     +17     -58
 10     1443   1447   1517      -4     -70
 11     1368   1379   1422     -11     -43
 12     1220   1224   1281      -4     -57

Arithmetic mean =             9.42  121.67
Algebraic mean  =            -2.58 -121.67






Date 10/1/74

Grade      A      B      C     A-B     B-C
  1     2359   2384   2643     -25    -259
  2     2146   2149   2377      -3    -228
  3     2081   2087   2320      -6    -233
  4     1999   1970   2193     +29    -223
  5     1924   1937   2162     -13    -225
  6     1869   1894   2067     -25    -173
  7     1842   1840   1875      +2     -35
  8     1811   1827   1867     -16     -40
  9     1782   1791   1848      -9     -57
 10     1493   1490   1551      +3     -61
 11     1417   1428   1499     -11     -71
 12     1315   1321   1374      -6     -53

Arithmetic mean =            12.33  138.17
Algebraic mean  =            -6.67 -138.17

TABLE 4.5 (continued)

Date 10/1/75

Grade      A      B      C     A-B     B-C
  1     2428   2457   2719     -29    -262
  2     2211   2234   2468     -23    -234
  3     2145   2151   2388      -6    -237
  4     2060   2070   2308     -10    -238
  5     1980   1949   2171     +31    -222
  6     1906   1917   2142     -11    -225
  7     1893   1917   2096     -24    -179
  8     1825   1842   1881     -17     -39
  9     1904   1902   1964      +2     -62
 10     1572   1580   1656      -8     -76
 11     1466   1465   1541      +1     -76
 12     1362   1373   1445     -11     -72

Arithmetic mean =            14.42  160.17
Algebraic mean  =            -8.75 -160.17



TABLE 4.6

Means and Corresponding Percentages Computed for Data
Used in the Tests of the Four Null Hypotheses

Hypothesis #1
                                      Arithmetic   % of Actual   Algebraic   % of Actual
                                         Mean      Enrollment      Mean      Enrollment
Actual enrollments minus
  Multivariable Predictions (A-B)        63.97        4.8         -13.42        1.0
Actual enrollments minus
  Percentage of Survival
  Projections (A-C)                      75.19        5.6         -61.89        4.6
|A-B| - |A-C|                            48.17        3.6         -10.47         .80

Hypothesis #2
                                      Arithmetic   % of Seed     Algebraic   % of Seed
                                         Mean         #1           Mean         #1
Seed #1 minus Seed #2
  (1st 6 years)                           6.05         .44         -1.39         .10
Seed #1 minus Seed #2
  (all years)                            11.44         .74          1.21         .08

Hypothesis #3
                                      Arithmetic   % of Seed #1  Algebraic   % of Seed #1
                                         Mean      (Symmetrical)   Mean      (Symmetrical)
Multivariable minus Symmetrical
  (1st 6 years)                           5.96         .44         -2.35         .17
Multivariable minus Symmetrical
  (all years)                             8.42         .54         -3.20         .21

Hypothesis #4
                                      Arithmetic   % of Seed #1  Algebraic   % of Seed #1
                                         Mean      (Symmetrical)   Mean      (Symmetrical)
Symmetrical minus Skewed
  (1st 6 years)                          23.78        1.77        -23.78        1.77
Symmetrical minus Skewed
  (all years)                            67.49        4.38        -67.49        4.38

rather than for the total enrollment for the last prediction
year. However, the figures indicated that the differences 
in the accuracy of the two methods were not large when 
using this definition of accuracy. 
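The ten per cent standard can be expressed as a short Python sketch. The function name is hypothetical, and treating a prediction that misses by exactly ten per cent as accurate is an assumption made here; the figures are the grade 10 row for 10/1/64 in Table 4.3.

```python
def is_accurate(predicted, actual, tolerance=0.10):
    """True if the prediction lies within plus or minus `tolerance` of the actual figure."""
    return abs(actual - predicted) <= tolerance * actual

# Grade 10, 10/1/64 (Table 4.3): actual 1055, multivariable 927, percentage of survival 992.
actual = 1055
print(is_accurate(927, actual))   # False: off by 128, and ten per cent of 1055 is 105.5
print(is_accurate(992, actual))   # True: off by 63
```
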

Perhaps the most meaningful way to interpret the finding 
of significant difference is to compare the differences in 
terms of numbers and percentages of students found under this 
hypothesis to those found under the other three hypotheses.
(See Table 4.6.) Since predictive validity data were available
for only six years, the data were compared to those for
the first six years for the other tests. Absolute differences
between the predicted and actual enrollment figures were calculated
for both the multivariable and percentage of survival
methods (Table 4.3). Algebraic differences between these
absolute differences are also reported in Table 4.3. The
arithmetic mean of this column is 48.17; the algebraic mean 
is -10.47. These figures represent the variation in the output 
attributable to the use of different prediction models; the 
variation is large compared to that which can be attributed to 
the use of different pseudo-random numbers, where the mean 
arithmetic difference is 6.05 and the mean algebraic difference 
is -1.39. The variation is also large when compared to the
errors introduced by using simulation instead of direct
calculation of the multivariable method; the mean arithmetic
difference for this factor for the first six years is 5.96, 
and the mean algebraic difference is -2.35. However, the 
variation in output caused by skewing the input data is 
somewhat larger; the mean arithmetic difference is 23.78; 
the mean algebraic difference is -23.78. If algebraic means 
are compared, the impact of concurrent validity as measured 
by comparing skewed and symmetrical predictions was greater 
than that made by the choice of the prediction model. Algebraic 
means, instead of arithmetic means, were compared since the 
algebraic mean best represents the advantage of one prediction 
method over the other. One can say that the choice of the prediction
model made a difference in the output of predictions
over and above the noise caused by imperfect reliability and the 
use of simulation, but the effect was somewhat less than the 
effect caused by skewed data. 

Only 7 of the 144 Kolmogorov-Smirnov tests of significance
for reliability failed to reach significance at the
0.05 level or beyond. The only difference in the program
or the input was the choice of the random number to initialize
GAUSS, the random number generator. The distributions
produced by the two simulations should be identical if the
pseudo-random numbers are truly random and the number of
iterations used in the simulation is large enough. The lack
of correspondence between the two sets of distributions
raised several questions:

(1) Were the initial random numbers chosen 
properly? 

(2) Is the random number generator functioning
properly according to other tests of
randomness?

(3) To what extent is the output of the 
simulation affected by the imperfect 
reliability? 

(4) What steps could be taken to improve the 
reliability? 

The investigator checked to make sure that the 
specifications for choosing the initial random numbers 
were followed exactly. GAUSS requires an odd integer with
nine or fewer digits (International Business Machines
Corporation, 1968:77 and 1959:5). 
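The stated requirement for the starting number can be written as a simple check. The Python sketch below is an illustration only; the function name is hypothetical, and the restriction to positive integers is an assumption, since only oddness and the nine-digit limit are stated above.

```python
def is_valid_gauss_seed(seed):
    """True for an odd integer of nine or fewer digits (positivity is assumed here)."""
    if isinstance(seed, bool) or not isinstance(seed, int):
        return False
    return 0 < seed <= 999_999_999 and seed % 2 == 1

print(is_valid_gauss_seed(65539))          # True: odd, five digits
print(is_valid_gauss_seed(65538))          # False: even
print(is_valid_gauss_seed(1234567891))     # False: ten digits
```
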

In order to answer the question about the proper 
functioning of the random number generator, pseudo-random 
numbers were generated using each of the two seeds. The 
numbers generated using each seed were considered to be 
scores on 20 variables for 500 people; scores were generated 
within persons. Intercorrelations among the variables were 
computed for both sets of data. The two sets of means and
variances of the variables were compared by t tests and F
tests. The matrices were compared by z tests for correlations.

Twenty t tests were performed to compare the means of
the twenty variables. Only one was significant; the value
of t was 2.71, significant at the .01 level. Of 20 F tests comparing
the variances of the variables in one matrix to those of the
other matrix, 4 were significant at the .05 level, indicating greater
variance among the numbers produced using the first seed.
Intercorrelations of variables within the matrices were compared
across matrices. Using z tests for the difference between
correlations, only 2 of the 190 tests were significant at the .05 level.
Thus only the F tests produced much evidence that the pseudo-random
numbers might not be adequately random. However,
there were some differences in the numbers used in those
tests and the numbers used in the simulation. The tests used
only 10,000 pseudo-random numbers produced by each seed; the
simulation employed approximately 100,000. The simulation
also used the technique of generating 1000 throw-away numbers
before generating the numbers to use in the predictions. More
significant is the fact that these tests of randomness are only
a few of the many tests which could have been performed
(International Business Machines Corporation, 1959:7-8).

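The layout of these checks can be sketched in Python. The sketch is not the procedure actually run: Python's own generator stands in for GAUSS, the two seeds are hypothetical, the t statistic uses separate variance estimates (the exact form of the t test is not stated above), and the z test uses Fisher's transformation of the correlations. Each seed supplies 10,000 numbers arranged as 500 persons by 20 variables, with scores generated within persons.

```python
import math
import random
from itertools import combinations

PERSONS, VARIABLES = 500, 20

def score_columns(seed):
    """Generate scores person by person, then return them as 20 columns of 500."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(VARIABLES)] for _ in range(PERSONS)]
    return [[row[j] for row in rows] for j in range(VARIABLES)]

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_for_means(xs, ys):
    return (mean(xs) - mean(ys)) / math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))

def f_for_variances(xs, ys):
    return variance(xs) / variance(ys)

def z_for_correlations(r1, n1, r2, n2):
    """Difference between two independent correlations via Fisher's z."""
    return (math.atanh(r1) - math.atanh(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

cols_1 = score_columns(12345)   # hypothetical seed
cols_2 = score_columns(54321)   # hypothetical seed

t_values = [t_for_means(cols_1[j], cols_2[j]) for j in range(VARIABLES)]
f_values = [f_for_variances(cols_1[j], cols_2[j]) for j in range(VARIABLES)]
z_values = [
    z_for_correlations(correlation(cols_1[j], cols_1[k]), PERSONS,
                       correlation(cols_2[j], cols_2[k]), PERSONS)
    for j, k in combinations(range(VARIABLES), 2)
]
print(len(t_values), len(f_values), len(z_values))   # 20 20 190
print(sum(abs(z) > 1.96 for z in z_values), "of 190 z tests beyond 1.96")
```

Twenty variables yield 190 distinct pairs, which accounts for the 190 z tests. With 20 tests at the .05 level about one significant result, and with 190 tests about nine or ten, would be expected by chance alone.
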
Perhaps the most meaningful way to examine the problem 
of reliability is to determine the extent to which the output
of the simulation is affected by imperfect reliability. One
measure of this is the comparison of the fit of the distributions
to the actual Brockton enrollment figures. For
this measure the actual enrollment figures were compared to the
eleven specified probability levels of the predicted enrollments.
An actual enrollment figure was considered to have
the same probability in both distributions if it fell either
on or between the same probability levels; using this standard,
sixty per cent of the actual enrollment figures had the same
probability. An enrollment figure was considered to have nearly
the same probability in both distributions if it satisfied the
above standard or if it fell exactly on a specified probability
level in one distribution and in the interval next to it in
the other; 79.2 per cent of the enrollments satisfied this
criterion. 
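The two agreement standards can be stated precisely in a short Python sketch. It is an interpretation of the description above, not the procedure actually programmed: the function names are hypothetical, and the two lists of eleven enrollment figures (one per specified probability level, in ascending order) are invented for illustration.

```python
import bisect

def position(levels, value):
    """Even code 2*i: exactly on level i. Odd code: strictly between two
    neighboring levels (or beyond either end of the list)."""
    i = bisect.bisect_left(levels, value)
    if i < len(levels) and levels[i] == value:
        return 2 * i
    return 2 * i - 1

def same_probability(levels_1, levels_2, actual):
    """On the same level, or between the same pair of levels, in both distributions."""
    return position(levels_1, actual) == position(levels_2, actual)

def nearly_same_probability(levels_1, levels_2, actual):
    """Same probability, or on a level in one distribution and in an interval
    next to that level in the other."""
    p1, p2 = position(levels_1, actual), position(levels_2, actual)
    return p1 == p2 or abs(p1 - p2) == 1

# Invented predictions at eleven probability levels from two simulation runs.
run_1 = [1600 + 10 * i for i in range(11)]   # 1600, 1610, ..., 1700
run_2 = [1602 + 10 * i for i in range(11)]   # 1602, 1612, ..., 1702

for actual in (1655, 1650, 1661):
    print(actual, same_probability(run_1, run_2, actual),
          nearly_same_probability(run_1, run_2, actual))
```

In the invented example, 1655 falls between the same two levels in both runs, 1650 falls exactly on a level in the first run and in the adjoining interval in the second, and 1661 satisfies neither standard.
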

Table 4.6 expresses the lack of reliability in terms of 
numbers and percentages of students. For both the calculations 
for the first six years and those for all twelve years, the 
arithmetic and algebraic means were less than those computed
to determine the effects of the prediction model and of skewing
the input data. Percentages of students involved in the reliability
calculations ranged from 0.08 per cent, corresponding to the
algebraic mean for all twelve years, to 0.74 per cent, corresponding
to the arithmetic mean for all twelve years. Percentage
figures for calculations determining the effects of the prediction
model and skewing ranged from 0.80 per cent to 5.6
per cent. Thus the lack of perfect reliability made a minimal
difference in the outputs relative to the differences made by
the choice of model and skewing of the input. The percentages
obtained from the reliability calculations were approximately
the same as those obtained from the calculations involving the
differences between simulated and non-simulated output. (See
Table 4.6.) In fact, all of these percentages seem small when
compared to the standard of accuracy of enrollment prediction
used by Greenawalt and Mitchell (1966:8): that enrollment
predictions within plus or minus ten per cent of the actual
enrollment are considered accurate.

There are several approaches that could be tried to
increase the reliability or to diminish the effect of
unreliability on the model. One would be to try a different
random number generator; another would be to increase the
number of iterations in the simulation so that random drawings
from distributions of variables would better represent the
distributions. Also, the format of the output could be
modified to make it less sensitive; for instance, fewer
percentage points could be reported. However, these would be the
domain of another study. The present study was designed to
employ a specified pseudo-random number generator, a specified
number of iterations, and a specified output format, and to test
the reliability under these conditions. Furthermore, refinement
of reliability is not considered by the present investigator
to have the highest priority among problems to be considered
in improving the simulation because the effects were relatively
minor.
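
The suggestion that more iterations would make the drawings better represent the distributions can be illustrated numerically. The listing in Appendix A runs 100 iterations for each projection; the Python sketch below compares the run-to-run spread of the simulated median at 100 and at 1000 iterations. The function name, the seed, the number of repeats, and the use of a standard normal variable are assumptions chosen only for the illustration.

```python
import random
import statistics


def median_spread(iterations, repeats=200, seed=1):
    """Standard deviation of the simulated median across repeated runs.

    Each run draws `iterations` standard normal deviates and records their
    median; a smaller spread means the 0.50 figure is more repeatable.
    """
    rng = random.Random(seed)
    medians = []
    for _ in range(repeats):
        draws = [rng.gauss(0.0, 1.0) for _ in range(iterations)]
        medians.append(statistics.median(draws))
    return statistics.stdev(medians)


if __name__ == "__main__":
    print("100 iterations: ", round(median_spread(100), 3))
    print("1000 iterations:", round(median_spread(1000), 3))
```

The spread should shrink roughly with the square root of the number of iterations, so a tenfold increase buys about a threefold gain in repeatability at ten times the computing cost.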

The effects of using the simulation rather than direct
calculation were also relatively minor. (See Table 4.6.)
For the calculations both for the first six years and for all
twelve years, the arithmetic means are slightly smaller than
those in the reliability test and the algebraic means are slightly
larger. The fact that the means are so similar suggests that
the differences between the simulated and nonsimulated
predictions can be accounted for by the lack of reliability in
the simulations. However, the fact that the algebraic means
are greater than those computed for the reliability test may
indicate a directional bias in the simulation, although the
test is not definitive on this point. A bias might be
attributed to lack of normality in the simulated
distributions; in this case, the 0.50 figures would not be expected
to conform exactly to the single figure predictions.

The number of times single figure multivariable
predictions fell within the various cumulative probabilities
of the simulated distributions was tabulated. Ten of the
144 multivariable prediction figures were the same as the
0.50 probability figures of the distributions produced by
simulation using symmetrical data. One hundred twenty-nine
of the single figures fell on or between the 0.40 and 0.60
figures, representing the points below which are 40 percent and
60 percent, respectively, of the figures in the distribution.
All of the single figures were between the 0.30 and 0.70
probability figures.
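
A tabulation of this kind can be sketched as follows. The Python example is an illustration only: it assumes each simulated distribution is available as a list of simulated enrollments, places each single figure in a band one tenth wide according to the share of simulated values below it, and uses hypothetical function names.

```python
from collections import Counter


def decile_band(figure, simulated):
    """Index k of the band [k/10, (k+1)/10) containing the cumulative
    probability of `figure`, i.e. the share of simulated values below it."""
    below = sum(1 for x in simulated if x < figure)
    # Integer arithmetic avoids floating-point trouble at band edges.
    return min((below * 10) // len(simulated), 9)


def tabulate(figures, distributions):
    """Count how many single figures fall in each band."""
    return Counter(decile_band(f, d) for f, d in zip(figures, distributions))


if __name__ == "__main__":
    simulated = list(range(100))  # stand-in for 100 simulated enrollments
    print(tabulate([50, 45, 55], [simulated] * 3))
```

A count concentrated in bands 4 and 5 corresponds to the pattern reported above, in which most single figures lay between the 0.40 and 0.60 figures.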

Further studies could attempt to improve
correspondence between simulated and nonsimulated results by improving
reliability or by searching for a cause of the possible
directional bias. Or they could concentrate on changing the
output format so that the output does not give the impression
of accuracy greater than the simulation can provide. Or they
could study what effect a lack of normality of the simulated
distributions has on the correspondence of the single figure
prediction to various points in the distributions. However,
in the opinion of the present investigator, these studies do
not have highest priority since the differences are
relatively small.

Differences which are not as small are revealed by
comparing output from the simulations using skewed and symmetrical
data. The use of skewed data may make as much as 4.38 per cent
difference. (See Table 4.6.) The number of times the single
figure multivariable predictions fell within the various
probability levels of the distributions using skewed data
showed that single figures fell 67 times within the 0.20-0.29
figures and 55 times within the 0.30-0.39 figures. All of the
single figure predictions fell between the 0.10 and 0.69 figures.
The differences between these results and those produced by the
symmetrical data reflect the larger enrollment figures
produced by the skewed data. The impact of the skewed data seems
to be large enough to suggest that other statistical models should
be tried; the simulation developed in the present study employs
beta distributions which are transformed into normal distributions.
However, it should be noted that the impact of skewed data
might not be so great with ordinary use of the simulation.
The high and low estimates were set arbitrarily by the
investigator in this study. In the set of skewed data,
all of the variables were skewed in the same direction;
use of authentic high and low estimates would probably
produce a set of data in which some of the variables were not
skewed and others were skewed in opposite directions.

Chapter V

Summary and Conclusions 

The adequacy of enrollment predictions can have a
substantial influence on the quality of educational programs.
Difficulties associated with making adequate predictions are
the extent of unpredictability of the phenomena, the inaccuracy
of the prediction methods, and the lack of adequate communication
between the forecaster and the user of the forecasts. Indeed,
it is the uncertainties associated with the forecast figures
that may be the most difficult to communicate. The present study
is an attempt to develop a systematic means by which a forecaster
can assess and express the uncertainties involved in the
predictions. A method of preparing probability distributions of
enrollment predictions was developed. The use of probability
distributions has the advantage of presenting probabilistic
information in a manner which is not as unwieldy as lists of
probabilities for various contingencies. Another advantage is that
it integrates probabilistic information into the numerical and
graphical presentation of predictions, making the message of
unpredictability more forceful than in the words of warning often
tucked into footnotes after pages of numbers and tables.






In the present study a basic method for single figure 
predictions was modified to accommodate probabilities. The 
multivariable method was chosen for this purpose since it 
allows the forecaster the freedom to deviate from simple 
projections and to make probability statements about individual 
variables. Monte Carlo computer simulation was chosen as the 
means for combining the probabilistic data to form the 
probability distributions. 
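
The idea of combining probabilistic estimates by simulation can be sketched briefly. The Python example below is a deliberately simplified illustration, not the program of Appendix A: it projects a single grade from a previous enrollment plus two hypothetical variables (a migration gain and a transfer loss), and the eleven reporting levels in `LEVELS` are an assumed set, since the text here does not list them.

```python
import random
import statistics

# Assumed reporting levels; the study used eleven levels, but this
# particular set is a guess made for the example.
LEVELS = (0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95)


def simulate_grade(previous, variables, iterations=100, seed=1):
    """Return simulated enrollments for one grade.

    `variables` is a list of (mean, sd, sign) tuples; each iteration draws
    one normal deviate per variable and adds or subtracts it.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        enrollment = float(previous)
        for mean, sd, sign in variables:
            enrollment += sign * rng.gauss(mean, sd)
        results.append(enrollment)
    return results


def probability_table(results):
    """Map each reporting level to the enrollment at that cumulative level."""
    cuts = statistics.quantiles(results, n=20)  # cut points at 5% steps
    # cuts[k] is the (k + 1) * 5 per cent point of the simulated values.
    return {p: cuts[round(p * 20) - 1] for p in LEVELS}


if __name__ == "__main__":
    runs = simulate_grade(1000, [(20.0, 5.0, +1), (5.0, 3.0, -1)])
    for level, value in probability_table(runs).items():
        print(f"{level:.2f}  {value:8.1f}")
```

Read as a table, the output gives the enrollment figure below which each stated proportion of the simulated outcomes falls, which is the form of probabilistic prediction the study set out to produce.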

Significance tests were performed to investigate predictive 
validity of the prediction method and reliability and concurrent 
validity of the simulation output. A comparison of the 
predictive validity of the multivariable method with that of 
the percentage of survival method in one school system showed 
greater accuracy for the multivariable method. However, other 
studies should be conducted to determine whether or not these 
results can be replicated and to compare the multivariable 
method with methods other than the percentage of survival 
method. The tests of reliability and concurrent validity 
showed evidence of undesired noise effects, but there are
indications that the effects of lack of reliability and
lack of agreement between simulated and nonsimulated results
are actually minimal. However, the lack of concurrent
validity as measured by the effect of skewed input may be
large enough to justify reconsideration of the statistical
distributions used to describe the variable estimates.
This can best be determined by using data in which the
high and low estimates are not arbitrarily set, as they were
in the present study, but are actual estimates of probabilities
by persons with sufficient information to make such estimates.

The major contribution of the study was not the infor- 
mation gained from testing the hypotheses, but the development 
of a method of predicting school enrollments in terms of 
probabilities. The development involved the choice of the 
multivariable method as the prediction model and the choice 
of computer simulation as the method of combining probabilistic 
estimates; these choices were made after reviewing the literature 
on enrollment prediction, population prediction, computer 
simulation, and other methods of handling estimates of proba- 
bilities. Other decisions that had to be made in the course 
of the development were the exact specifications of the 
multivariable model, the type of output to be produced, the 
type of input to be required, the form of the distributions 
for the probability estimates in the input, the development 
of the computer program, and the kinds of statistical tests
to be performed on the output. The outcome and rationale
for these decisions are explained in Chapters III and IV.
Other actions could have been taken at many of these decision
points, but the present investigator felt that it would be
more useful to prepare one complete model that could begin
to be used than to conduct separate studies for each of
the decisions that had to be made.

As a result of the developmental process, the computer 
programs and instructions are ready for use although it is 
suggested that they be employed with a heuristic approach, 
seeking more information about such factors as predictive 
validity. The development of the model and its use with 
trial data served to illustrate possible problems with the 
model. A question which was raised by the results of the 
simulation was that of independence of the variables. 

The lack of independence among the variables is one 
explanation for the fact that approximately 40 percent 
of the actual Brockton enrollments fell outside the 0.05 
and 0.95 probability levels. (See Table 5.1, p.132.) 

TABLE 5.1

NUMBER OF TIMES THE ACTUAL ENROLLMENTS EXCEEDED THE
.05 AND .95 PROBABILITY POINTS OF THE PREDICTIONS

YEAR     RANDOM #1     RANDOM #2

1964         8             7
1965         4             4
1966         3             3
1967         6             7
1968         4             5
1969         3             3

One should remember that the distributions produced by the
simulations are dependent upon arbitrarily chosen high and low
estimates; those estimates were, in general, chosen to be plus
and minus three standard deviations from the mean for the
previous few years. If the variables are truly independent,
this indicates a large departure from past trends in Brockton.
Alternatively, it could be hypothesized that the arbitrarily
set high and low estimates were not inadequate, but that the
assumption of independence constricted the simulation
distributions. Even if the variables are adequately independent,
there is probably dependence among the years and grades,
i.e., migration rates in one grade or year probably correlate
with migration rates in other grades or years, but the model
requires an assumption of independence even across grades and
years for the same variable.
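
The way an independence assumption can constrict a simulated distribution may be illustrated with a small experiment. The Python sketch below is an illustration under stated assumptions, not part of the original program: it sums unit-normal yearly deviations that share a common component, with the pairwise correlation `rho`, the number of years, the seed, and the function name all chosen only for the example.

```python
import math
import random
import statistics


def total_spread(years, rho, runs=2000, seed=3):
    """Std. dev. of the sum of `years` unit-normal deviations whose pairwise
    correlation is `rho` (rho = 0 reproduces the independence assumption)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        shared = rng.gauss(0.0, 1.0)  # component common to every year
        total = 0.0
        for _ in range(years):
            own = rng.gauss(0.0, 1.0)  # component specific to one year
            total += math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * own
        totals.append(total)
    return statistics.stdev(totals)


if __name__ == "__main__":
    print("independent:", round(total_spread(6, 0.0), 2))
    print("correlated: ", round(total_spread(6, 0.8), 2))
```

With six years, the independent case has a theoretical spread of the square root of 6, about 2.4, while a correlation of 0.8 raises it to the square root of 30, about 5.5. A model that assumes independence when the variables in fact move together would therefore report distributions that are too narrow, which is one way actual enrollments could fall outside the 0.05 and 0.95 levels more often than expected.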

Perhaps the most serious problems with the simulation
model in its present form are this assumption of independence
and the assumption of a beta distribution of estimates with
the subsequent transformation into a normal distribution. Perhaps
the most serious question about the attempt to use probability
estimates in enrollment prediction is whether or not adequate
probability estimates can be made. More research needs to be
done on the basic problem of applying probabilities to
enrollment predictions: both enrollment prediction methods and
methods of estimating probabilities need to be validated and
perfected. Since a primary purpose of the use of probability
distributions of enrollment is to aid communication between the
forecaster and the user, studies of the effectiveness of
various formats of presentation need to be conducted.
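
The treatment of the three estimates can be made concrete. The listing in Appendix A reduces each set of estimates to a mean, (high + 4 x likely + low)/6, and a standard deviation, (high - low)/6, and then draws normal deviates with those parameters. The Python sketch below restates that step; the function names and the numerical estimates are hypothetical, and Python's `random.gauss` stands in for the GAUSS subroutine called in the listing.

```python
import random


def three_point_summary(high, likely, low):
    """Mean and standard deviation implied by high, likely, low estimates."""
    mean = (high + 4.0 * likely + low) / 6.0
    sd = (high - low) / 6.0
    return mean, sd


def draw_estimate(rng, high, likely, low):
    """Draw one normal deviate using the summary of the three estimates."""
    mean, sd = three_point_summary(high, likely, low)
    return rng.gauss(mean, sd)


if __name__ == "__main__":
    print(three_point_summary(130, 100, 70))  # symmetrical estimates
    print(three_point_summary(140, 100, 90))  # estimates skewed upward
    print(draw_estimate(random.Random(1), 130, 100, 70))
```

With symmetrical estimates the mean equals the most likely value; with estimates skewed upward the mean moves above it, so the normal distribution actually sampled is centered away from the forecaster's most likely figure. This is the mechanism behind the larger enrollment figures produced by the skewed data.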

Until such advances are made, the present study provides a 
simulation model for enrollment prediction whose use is not 
discouraged if the user keeps in mind the reservations stated 
by the investigator. Its use or the use of a modification 
of the model is encouraged if the user plans to take advantage 
of the opportunity to add to the research and validation data 
and to suggest improvements.

The model developed here is considered a prototype of
models which could be developed. One might want to predict
the enrollment figures for a state as a whole or for
individual schools within a district. One might want
predictions separately by race or socioeconomic class. An
ungraded school might need predictions based on categories other
than grade level, perhaps categories designating progress in
relation to curriculum goals, so that demands on specialized
teachers, equipment, and facilities might be anticipated. A
unique contribution of the prototype for enrollment prediction
is that it requires the user to examine the various parts of
the system, to assess probabilities for these parts, and to
use these probabilities to determine probabilistic information
about the operation of the system as a whole. It is the
demonstration of this methodology in a workable model
for computer simulation which gives this study its
significance.

APPENDIX A
Program MAIN



C     A-, B-, AND C-ARRAYS HOLD THE HIGH, LIKELY, AND LOW ESTIMATES,
C     X- AND S-ARRAYS THE MEANS AND STANDARD DEVIATIONS, AND
C     D-ARRAYS THE PROPORTION OF BOYS.
      DIMENSION TPROJ(100), BPROJ(100), GPROJ(100), ENROLL(2,13),
     2BIRTHS(2,5), DEATHS(2,5), ABIRTH(6,11), BBIRTH(6,11),CBIRTH(6,11),
     3DBIRTH(6,11), XBIRTH(6,11), SBIRTH(6,11), AGIRLS(6,11),
     4BGIRLS(6,11), CGIRLS(6,11), XGIRLS(6,11), SGIRLS(6,11),
     5ADEATH(11), BDEATH(11), CDEATH(11), DDEATH(11), XDEATH(11),
     6SDEATH(11), APREMI(15), BPREMI(15), CPREMI(15), DPREMI(15),
     7XPREMI(15), SPREMI(15), AMIGRA(13,15), BMIGRA(13,15),
     8CMIGRA(13,15), DMIGRA(13,15), XMIGRA(13,15), SMIGRA(13,15),
     9APRIVS(15), BPRIVS(15), CPRIVS(15), DPRIVS(15), XPRIVS(15),
     ASPRIVS(15), ATRANS(13,15), BTRANS(13,15), CTRANS(13,15),
     BDTRANS(13,15), XTRANS(13,15), STRANS(13,15), AHOLDS(13,15),
     CBHOLDS(13,15), CHOLDS(13,15), DHOLDS(13,15), XHOLDS(13,15),
     DSHOLDS(13,15), AINSTI(13,15), BINSTI(13,15), CINSTI(13,15),
     EDINSTI(13,15), XINSTI(13,15), SINSTI(13,15), ADROPS(5,15),
     FBDROPS(5,15), CDROPS(5,15), DDROPS(5,15), XDROPS(5,15),
     GSDROPS(5,15), TITLE(20), FORM1(20), FORM2(20), FORM3(20),
     HFORM4(20), FORM5(20), FORM6(20), FORM7(20), FORM8(20), FORM9(20),
     IFORM10(20), FORM11(20), FORM12(20), FORM13(20), BIRTH(6),
     JRETAIN(13,15), RHOLDS(100)
      DIMENSION TPREVY(13,100), BPREVY(13,100), TPRESY(13,100),
     2BPRESY(13,100), TOTBIR(6), BOYBIR(6)
      READ (5,1) TITLE
      READ (5,2) IYEAR, IGRADE, ISEX, IDATE, IX
      READ (5,3) FORM1, FORM2, FORM3, FORM4, FORM5, FORM6, FORM7, FORM8,
     2FORM9, FORM10, FORM11, FORM12, FORM13
    1 FORMAT (20A4)
    2 FORMAT (8X, I2, 3X, I2, 4X, I1, 1X, I4, 1X, I9)
    3 FORMAT (20A4)
      IF (ISEX .EQ. 1) GO TO 444
C     INPUT WITHOUT A BREAKDOWN BY SEX
      READ (5,FORM1) (ENROLL(1,J), J = 1,IGRADE)
      IF (IGRADE .EQ. 12) LIMIT = 5
      IF (IGRADE .EQ. 13) LIMIT = 4
      READ (5,FORM2) (BIRTHS(1,J), J = 1,LIMIT)
      READ (5,FORM3) (DEATHS(1,J), J = 1,LIMIT)
      ISTOP = IYEAR - LIMIT
      READ (5,FORM4) ((ABIRTH(I,J), BBIRTH(I,J), CBIRTH(I,J), I = 1,6),
     2J = 1,ISTOP)
      READ (5,FORM5) ((AGIRLS(I,J), BGIRLS(I,J), CGIRLS(I,J), I = 1,6),
     2J = 1,ISTOP)
      READ (5,FORM6) (ADEATH(J), BDEATH(J), CDEATH(J), J = 1,ISTOP)
      READ (5,FORM7) (APREMI(J), BPREMI(J), CPREMI(J), J = 1,IYEAR)
      READ (5,FORM8) ((AMIGRA(I,J), BMIGRA(I,J), CMIGRA(I,J),
     2I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM9) (APRIVS(J), BPRIVS(J), CPRIVS(J), J = 1,IYEAR)
      READ (5,FORM10) ((ATRANS(I,J), BTRANS(I,J), CTRANS(I,J),
     2I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM11) ((AHOLDS(I,J), BHOLDS(I,J), CHOLDS(I,J),
     2I = 1,IGRADE), J = 1,IYEAR)
      READ (5,FORM12) ((AINSTI(I,J), BINSTI(I,J), CINSTI(I,J),
     2I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM13) ((ADROPS(I,J), BDROPS(I,J), CDROPS(I,J), I = 1,5),
     2J = 1,IYEAR)
      GO TO 222
  444 READ (5,FORM1) ((ENROLL(I,J), I = 1,2), J = 1,IGRADE)
      IF (IGRADE .EQ. 12) LIMIT = 5
      IF (IGRADE .EQ. 13) LIMIT = 4
      READ (5,FORM2) ((BIRTHS(I,J), I = 1,2), J = 1,LIMIT)
      READ (5,FORM3) ((DEATHS(I,J), I = 1,2), J = 1,LIMIT)
      ISTOP = IYEAR - LIMIT
      READ (5,FORM4) ((ABIRTH(I,J), BBIRTH(I,J), CBIRTH(I,J),
     2DBIRTH(I,J), I = 1,6), J = 1,ISTOP)
      READ (5,FORM5) ((AGIRLS(I,J), BGIRLS(I,J), CGIRLS(I,J), I = 1,6),
     2J = 1,ISTOP)
      READ (5,FORM6) (ADEATH(J), BDEATH(J), CDEATH(J), DDEATH(J),
     2J = 1,ISTOP)
      READ (5,FORM7) (APREMI(J), BPREMI(J), CPREMI(J), DPREMI(J),
     2J = 1,IYEAR)
      READ (5,FORM8) ((AMIGRA(I,J), BMIGRA(I,J), CMIGRA(I,J),
     2DMIGRA(I,J), I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM9) (APRIVS(J), BPRIVS(J), CPRIVS(J), DPRIVS(J),
     2J = 1,IYEAR)
      READ (5,FORM10) ((ATRANS(I,J), BTRANS(I,J), CTRANS(I,J),
     2DTRANS(I,J), I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM11) ((AHOLDS(I,J), BHOLDS(I,J), CHOLDS(I,J),
     2DHOLDS(I,J), I = 1,IGRADE), J = 1,IYEAR)
      READ (5,FORM12) ((AINSTI(I,J), BINSTI(I,J), CINSTI(I,J),
     2DINSTI(I,J), I = 2,IGRADE), J = 1,IYEAR)
      READ (5,FORM13) ((ADROPS(I,J), BDROPS(I,J), CDROPS(I,J),
     2DDROPS(I,J), I = 1,5), J = 1,IYEAR)
C     THIS PART OF THE PROGRAM PRINTS INPUT AS A CHECK, COMPUTES AND
C     PRINTS MEANS AND STANDARD DEVIATIONS.
  222 WRITE (6,4) TITLE, IYEAR, IGRADE, ISEX, IDATE, IX, FORM1, FORM2,
     2FORM3, FORM4, FORM5, FORM6, FORM7, FORM8, FORM9, FORM10, FORM11,
     3FORM12, FORM13
    4 FORMAT (1H1,20A4//1X,'NUMBER OF YEARS',5X,I2/1X,
     2'NUMBER OF GRADES',5X,I2/1X,'SEX OPTION',5X,I1/1X,
     3'BEGINNING YEAR',5X,I4/1X,'SEED',5X,I9//1X,'INPUT FORMATS'//
     4(1X,20A4))
      IF (ISEX .EQ. 1) GO TO 13131
      WRITE (6,5) (J, ENROLL(1,J), J = 1,IGRADE)
    5 FORMAT (1H1,'ENROLLMENT BY GRADE IN BASE YEAR'//(1X,I2,5X,F10.0))
      WRITE (6,6) (J, BIRTHS(1,J), J = 1,LIMIT)
    6 FORMAT (1H1,'HISTORICAL BIRTHS BY YEAR OF SIMULATION'//
     2(1X,I2,5X,F10.0))
      WRITE (6,7) (J, DEATHS(1,J), J = 1,LIMIT)
    7 FORMAT (1H1,'PRESCHOOL DEATHS BY YEAR OF SIMULATION'//
     2(1X,I2,5X,F10.0))
      GO TO 1313
13131 WRITE (6,505) (J, ENROLL(1,J), ENROLL(2,J), J = 1,IGRADE)
  505 FORMAT (1H1,'ENROLLMENT BY GRADE IN BASE YEAR'//13X,'TOTAL',10X,
     2'BOYS'//(1X,I2,2(5X,F10.0)))
      WRITE (6,606) (J, BIRTHS(1,J), BIRTHS(2,J), J = 1,LIMIT)
  606 FORMAT (1H1,'HISTORICAL BIRTHS BY YEAR OF SIMULATION'//9X,'TOTAL',
     25X,'BOYS'//(1X,I2,2(5X,F10.0)))
      WRITE (6,707) (J, DEATHS(1,J), DEATHS(2,J), J = 1,LIMIT)
  707 FORMAT (1H1,'PRESCHOOL DEATHS BY YEAR OF SIMULATION'//
     214X,'TOTAL',10X,'BOYS'//(1X,I2,2(5X,F10.0)))
 1313 WRITE (6,8)
    8 FORMAT (1H0,'PREDICTIONS FOR BIRTH RATE BY AGE GROUP'/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
  909 FORMAT (1H ,'YEAR',2X,'LEVEL',2X,'HIGH ESTIMATE',3X,'LIKELY ESTIMA
     2TE',3X,'LOW ESTIMATE',5X,'MEAN',5X,'STANDARD DEVIATION',5X,'PROPOR
     3TION BOYS'/)
    9 FORMAT (1H ,'YEAR',2X,'LEVEL',2X,'HIGH ESTIMATE',3X,
     2'LIKELY ESTIMATE',3X,'LOW ESTIMATE',5X,'MEAN',5X,
     3'STANDARD DEVIATION'/)
      DO 400 J = 1,ISTOP
      DO 400 I = 1,6
      XBIRTH(I,J) = (ABIRTH(I,J) + 4. * BBIRTH(I,J) + CBIRTH(I,J))/6.0
      SBIRTH(I,J) = (ABIRTH(I,J) - CBIRTH(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,404) J, I, ABIRTH(I,J), BBIRTH(I,J),
     2CBIRTH(I,J), XBIRTH(I,J), SBIRTH(I,J), DBIRTH(I,J)
      IF (ISEX .EQ. 0) WRITE (6,40) J, I, ABIRTH(I,J), BBIRTH(I,J),
     2CBIRTH(I,J), XBIRTH(I,J), SBIRTH(I,J)
  404 FORMAT (1H ,I2,5X,I2,3X,3(F10.1,5X),F10.2,5X,F10.2,15X,F6.3)
   40 FORMAT (1H ,I2,5X,I2,3X,3(F10.1,5X),F10.2,5X,F10.2)
  400 CONTINUE
      WRITE (6,51)
   51 FORMAT (1H0,'PREDICTIONS FOR FEMALES BY AGE GROUP'/)
      WRITE (6,9)
      DO 500 J = 1,ISTOP
      DO 500 I = 1,6
      XGIRLS(I,J) = (AGIRLS(I,J) + 4. * BGIRLS(I,J) + CGIRLS(I,J))/6.0
      SGIRLS(I,J) = (AGIRLS(I,J) - CGIRLS(I,J))/6.0
      WRITE (6,40) J, I, AGIRLS(I,J), BGIRLS(I,J), CGIRLS(I,J),
     2XGIRLS(I,J), SGIRLS(I,J)
  500 CONTINUE
      WRITE (6,61)
   61 FORMAT (1H0,'PREDICTIONS OF PERCENTAGES OF PRESCHOOL DEATHS'/)
      IF (ISEX .EQ. 1) WRITE (6,1010)
      IF (ISEX .EQ. 0) WRITE (6,10)
 1010 FORMAT (1H ,'YEAR',2X,'HIGH ESTIMATE',3X,'LIKELY ESTIMATE',3X,
     2'LOW ESTIMATE',5X,'MEAN',5X,'STANDARD DEVIATION',5X,
     3'PROPORTION BOYS'/)

   10 FORMAT (1H ,'YEAR',2X,'HIGH ESTIMATE',3X,'LIKELY ESTIMATE',3X,
     2'LOW ESTIMATE',5X,'MEAN',6X,'STANDARD DEVIATION'/)
      DO 600 J = 1,ISTOP
      XDEATH(J) = (ADEATH(J) + 4.*BDEATH(J) + CDEATH(J))/6.0
      SDEATH(J) = (ADEATH(J) - CDEATH(J))/6.0
      IF (ISEX .EQ. 1) WRITE(6,6060) J, ADEATH(J), BDEATH(J), CDEATH(J),
     2XDEATH(J), SDEATH(J), DDEATH(J)
      IF (ISEX .EQ. 0) WRITE(6,60) J, ADEATH(J), BDEATH(J), CDEATH(J),
     2XDEATH(J), SDEATH(J)
 6060 FORMAT (1H ,I2,5(5X,F10.7),12X,F6.3)
   60 FORMAT (1H ,I2,5(5X,F10.7))
  600 CONTINUE
      WRITE (6,71)
   71 FORMAT (1H0,'PREDICTIONS OF PRESCHOOL MIGRATION'/)
      IF (ISEX .EQ. 1) WRITE (6,1010)
      IF (ISEX .EQ. 0) WRITE (6,10)
      DO 700 J = 1,IYEAR
      XPREMI(J) = (APREMI(J) + 4. * BPREMI(J) + CPREMI(J))/6.0
      SPREMI(J) = (APREMI(J) - CPREMI(J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,7070) J, APREMI(J), BPREMI(J),
     2CPREMI(J), XPREMI(J), SPREMI(J), DPREMI(J)
      IF (ISEX .EQ. 0) WRITE (6,70) J, APREMI(J), BPREMI(J), CPREMI(J),
     2XPREMI(J), SPREMI(J)
 7070 FORMAT (1H ,I2,3(5X,F10.1),2(5X,F10.2),12X,F6.3)
   70 FORMAT (1H ,I2,3(5X,F10.1),2(5X,F10.2))
  700 CONTINUE
      WRITE (6,81)
   81 FORMAT (1H0,'PREDICTIONS OF NET MIGRATION BY GRADE'/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
      DO 800 J = 1,IYEAR
      DO 800 I = 2,IGRADE
      XMIGRA(I,J) = (AMIGRA(I,J) + 4. * BMIGRA(I,J) + CMIGRA(I,J))/6.0
      SMIGRA(I,J) = (AMIGRA(I,J) - CMIGRA(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,404) J, I, AMIGRA(I,J), BMIGRA(I,J),
     2CMIGRA(I,J), XMIGRA(I,J), SMIGRA(I,J), DMIGRA(I,J)
      IF (ISEX .EQ. 0) WRITE (6,40) J, I, AMIGRA(I,J), BMIGRA(I,J),
     2CMIGRA(I,J), XMIGRA(I,J), SMIGRA(I,J)
  800 CONTINUE
      WRITE (6,91)
   91 FORMAT (1H0,'PREDICTIONS OF POTENTIAL LEVEL 1 STUDENTS ENROLLING
     2IN NON-PUBLIC SCHOOLS'/)
      IF (ISEX .EQ. 1) WRITE (6,1010)
      IF (ISEX .EQ. 0) WRITE (6,10)
      DO 900 J = 1,IYEAR
      XPRIVS(J) = (APRIVS(J) + 4. * BPRIVS(J) + CPRIVS(J))/6.0
      SPRIVS(J) = (APRIVS(J) - CPRIVS(J))/6.0


      IF (ISEX .EQ. 1) WRITE (6,7070) J, APRIVS(J), BPRIVS(J),
     2CPRIVS(J), XPRIVS(J), SPRIVS(J), DPRIVS(J)
      IF (ISEX .EQ. 0) WRITE (6,70) J, APRIVS(J), BPRIVS(J), CPRIVS(J),
     2XPRIVS(J), SPRIVS(J)
  900 CONTINUE
      WRITE (6,101)
  101 FORMAT (1H0,'PREDICTIONS OF TRANSFERS TO/FROM NON-PUBLIC SCHOOLS
     2BY GRADE'/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
      DO 1000 J = 1,IYEAR
      DO 1000 I = 2,IGRADE
      XTRANS(I,J) = (ATRANS(I,J) + 4. * BTRANS(I,J) + CTRANS(I,J))/6.0
      STRANS(I,J) = (ATRANS(I,J) - CTRANS(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,404) J, I, ATRANS(I,J), BTRANS(I,J),
     2CTRANS(I,J), XTRANS(I,J), STRANS(I,J), DTRANS(I,J)
      IF (ISEX .EQ. 0) WRITE (6,40) J, I, ATRANS(I,J), BTRANS(I,J),
     2CTRANS(I,J), XTRANS(I,J), STRANS(I,J)
 1000 CONTINUE
      WRITE (6,111)
  111 FORMAT (1H0,'PREDICTIONS OF PERCENTAGES RETAINED IN EACH GRADE'/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
      DO 1100 J = 1,IYEAR
      DO 1100 I = 1,IGRADE
      XHOLDS(I,J) = (AHOLDS(I,J) + 4. * BHOLDS(I,J) + CHOLDS(I,J))/6.0
      SHOLDS(I,J) = (AHOLDS(I,J) - CHOLDS(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,5656) J, I, AHOLDS(I,J),
     2BHOLDS(I,J), CHOLDS(I,J), XHOLDS(I,J), SHOLDS(I,J), DHOLDS(I,J)
      IF (ISEX .EQ. 0) WRITE (6,110) J, I, AHOLDS(I,J),
     2BHOLDS(I,J), CHOLDS(I,J), XHOLDS(I,J), SHOLDS(I,J)
 5656 FORMAT (1H ,I2,5X,I2,5(5X,F10.7),12X,F6.3)
  110 FORMAT (1H ,I2,5X,I2,5(5X,F10.7))
 1100 CONTINUE
      WRITE (6,121)
  121 FORMAT (1H0,'PREDICTIONS OF PERCENTAGE LOSS BY GRADE BECAUSE OF
     2DEATH OR INSTITUTIONALIZATION'/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
      DO 1200 J = 1,IYEAR
      DO 1200 I = 2,IGRADE
      XINSTI(I,J) = (AINSTI(I,J) + 4. * BINSTI(I,J) + CINSTI(I,J))/6.0
      SINSTI(I,J) = (AINSTI(I,J) - CINSTI(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,5656) J, I, AINSTI(I,J), BINSTI(I,J),
     2CINSTI(I,J), XINSTI(I,J), SINSTI(I,J), DINSTI(I,J)
      IF (ISEX .EQ. 0) WRITE (6,110) J, I, AINSTI(I,J), BINSTI(I,J),
     2CINSTI(I,J), XINSTI(I,J), SINSTI(I,J)
 1200 CONTINUE



      WRITE (6,131)
  131 FORMAT (1H0,'PREDICTIONS OF PERCENTAGE OF DROPOUTS, GRADES 7 - 11'
     2/)
      IF (ISEX .EQ. 1) WRITE (6,909)
      IF (ISEX .EQ. 0) WRITE (6,9)
      DO 1300 J = 1,IYEAR
      DO 1300 I = 1,5
      XDROPS(I,J) = (ADROPS(I,J) + 4. * BDROPS(I,J) + CDROPS(I,J))/6.0
      SDROPS(I,J) = (ADROPS(I,J) - CDROPS(I,J))/6.0
      IF (ISEX .EQ. 1) WRITE (6,5656) J, I, ADROPS(I,J),
     2BDROPS(I,J), CDROPS(I,J), XDROPS(I,J), SDROPS(I,J),
     3DDROPS(I,J)
      IF (ISEX .EQ. 0) WRITE (6,110) J, I, ADROPS(I,J), BDROPS(I,J),
     2CDROPS(I,J), XDROPS(I,J), SDROPS(I,J)
 1300 CONTINUE
C     THIS PART OF PROGRAM BEGINS ITERATIONS AND PROJECTS FIRST GRADE
C     (OR KINDERGARTEN) ENROLLMENT FOR THE FIRST 5 (OR 4) YEARS OF THE
C     SIMULATION
      AM = 0.0
      S = 1.0
      DO 1239 I = 1,1000
 1239 CALL GAUSS (IX, S, AM, V)
      DO 11111 J = 1,IYEAR
      DO 22222 I = 1,IGRADE
      IF (I .GT. 1) GO TO 33333
      IF (J .GT. LIMIT) GO TO 44444
      DO 10001 M = 1,100
      TPROJ(M) = BIRTHS(1,J) - DEATHS(1,J)
      CALL GAUSS (IX,SPREMI(J),XPREMI(J),V1)
      TPROJ(M) = TPROJ(M) + V1
      CALL GAUSS (IX,SPRIVS(J),XPRIVS(J),V2)
      TPROJ(M) = TPROJ(M) - V2
      CALL GAUSS (IX, SHOLDS(I,J), XHOLDS(I,J), V3)
      RHOLDS(M) = V3
      IF (J .EQ. 1) TPROJ(M) = TPROJ(M) + V3 * ENROLL(1,1)
      IF (J .GT. 1) TPROJ(M) = TPROJ(M) + V3 * TPREVY(1,M)
      TPRESY(1,M) = TPROJ(M)
      IF (ISEX .EQ. 0) GO TO 10001
      BPROJ(M) = BIRTHS(2,J) - DEATHS(2,J)
      BPROJ(M) = BPROJ(M) + V1 * DPREMI(J)
      BPROJ(M) = BPROJ(M) - V2 * DPRIVS(J)
      IF (J .EQ. 1) BPROJ(M) = BPROJ(M) + V3 * DHOLDS(1,J) * ENROLL(2,1)
      IF (J .GT. 1) BPROJ(M) = BPROJ(M) + V3 * DHOLDS(1,J) *
     2BPREVY(1,M)
      BPRESY(1,M) = BPROJ(M)
10001 CONTINUE
      IF (ISEX .EQ. 0) GO TO 98777
      DO 77777 M = 1,100
      GPROJ(M) = TPROJ(M) - BPROJ(M)
77777 CONTINUE
98777 ICOUNT = 1
      CALL OUTPUT (I,J,TPROJ,ICOUNT,IGRADE,IDATE)
      IF (ISEX .EQ. 0) GO TO 22222
      ICOUNT = 2
      CALL OUTPUT (I,J,BPROJ,ICOUNT,IGRADE,IDATE)
      ICOUNT = 3
      CALL OUTPUT (I,J,GPROJ,ICOUNT,IGRADE,IDATE)
      GO TO 22222
C     THIS PART OF PROGRAM PROJECTS FIRST GRADE (OR KINDERGARTEN)
C     ENROLLMENT FOR THE REMAINING YEARS OF THE SIMULATION
44444 DO 20002 M = 1,100
      N = J - LIMIT
      TOTAL1 = 0.0
      TOTAL2 = 0.0
      DO 19 L = 1,6
      CALL GAUSS (IX, SBIRTH(L,N), XBIRTH(L,N), V)
      VV = V
      CALL GAUSS (IX,SGIRLS(L,N),XGIRLS(L,N),V)
      TOTBIR(L) = VV * V / 1000.00
      TOTAL1 = TOTAL1 + TOTBIR(L)
      IF (ISEX .EQ. 0) GO TO 19
      BOYBIR(L) = VV * DBIRTH(L,N) * V / 1000.00
      TOTAL2 = TOTAL2 + BOYBIR(L)
   19 CONTINUE
      CALL GAUSS (IX, SDEATH(N), XDEATH(N), V1)
      TPROJ(M) = TOTAL1 - V1 * TOTAL1
      CALL GAUSS (IX, SPREMI(J), XPREMI(J), V2)
      TPROJ(M) = TPROJ(M) + V2
      CALL GAUSS (IX, SPRIVS(J), XPRIVS(J), V3)
      TPROJ(M) = TPROJ(M) - V3
      CALL GAUSS (IX, SHOLDS(1,J), XHOLDS(1,J), V4)
      RHOLDS(M) = V4
      TPROJ(M) = TPROJ(M) + V4 * TPREVY(1,M)
      TPRESY(1,M) = TPROJ(M)
      IF (ISEX .EQ. 0) GO TO 20002
      BPROJ(M) = TOTAL2 - V1 * TOTAL2
      BPROJ(M) = BPROJ(M) + V2 * DPREMI(J)
      BPROJ(M) = BPROJ(M) - V3 * DPRIVS(J)
      BPROJ(M) = BPROJ(M) + V4 * DHOLDS(1,J) * BPREVY(1,M)
      BPRESY(1,M) = BPROJ(M)
20002 CONTINUE
      IF (ISEX .EQ. 0) GO TO 98768
      DO 70007 M = 1,100
      GPROJ(M) = TPROJ(M) - BPROJ(M)
70007 CONTINUE
98768 ICOUNT = 1
      CALL OUTPUT (I, J, TPROJ, ICOUNT, IGRADE, IDATE)
      IF (ISEX .EQ. 0) GO TO 22222
      ICOUNT = 2
      CALL OUTPUT (I, J, BPROJ, ICOUNT, IGRADE, IDATE)
      ICOUNT = 3
      CALL OUTPUT (I, J, GPROJ, ICOUNT, IGRADE, IDATE)
      GO TO 22222
C     THIS PART OF PROGRAM PROJECTS ENROLLMENT FOR SECOND (OR FIRST)
C     GRADE THROUGH GRADE 12 (OR 13) FOR THE FIRST YEAR OF SIMULATION
33333 IF (J .GT. 1) GO TO 88888
      DO 30003 M = 1,100
      TPROJ(M) = ENROLL(1,I-1)
      CALL GAUSS (IX, SMIGRA(I,1), XMIGRA(I,1), V1)
      TPROJ(M) = TPROJ(M) + V1
      CALL GAUSS (IX, STRANS(I,1), XTRANS(I,1), V2)
      TPROJ(M) = TPROJ(M) + V2
      TPROJ(M) = TPROJ(M) - RHOLDS(M) * ENROLL(1,I-1)
      CALL GAUSS (IX, SHOLDS(I,J), XHOLDS(I,J), V3)
      RHOLDS(M) = V3
      TPROJ(M) = TPROJ(M) + V3 * ENROLL(1,I)
      CALL GAUSS (IX, SINSTI(I,1), XINSTI(I,1), V4)
      TPROJ(M) = TPROJ(M) - V4 * ENROLL(1,I-1)
      IF (IGRADE .EQ. 13 .AND. I .GT. 8) GO TO 33
      IF (IGRADE .EQ. 12 .AND. I .GT. 7) GO TO 33
      GO TO 6061
   33 L = I - IGRADE + 5
      CALL GAUSS (IX, SDROPS(L,1), XDROPS(L,1), V5)
      TPROJ(M) = TPROJ(M) - V5 * ENROLL(1,I-1)
 6061 TPRESY(I,M) = TPROJ(M)
      IF (ISEX .EQ. 0) GO TO 30003
      BPROJ(M) = ENROLL(2,I-1)
      BPROJ(M) = BPROJ(M) + V1 * DMIGRA(I,1)
      BPROJ(M) = BPROJ(M) + V2 * DTRANS(I,1)
      BPROJ(M) = BPROJ(M) - RHOLDS(M) * DHOLDS(I-1,J) * ENROLL(2,I-1)
      BPROJ(M) = BPROJ(M) + V3 * DHOLDS(I,J) * ENROLL(2,I)
      BPROJ(M) = BPROJ(M) - V4 * DINSTI(I,1) * ENROLL(2,I-1)
      IF (IGRADE .EQ. 13 .AND. I .GT. 8) GO TO 333
      IF (IGRADE .EQ. 12 .AND. I .GT. 7) GO TO 333
      GO TO 3333
  333 BPROJ(M) = BPROJ(M) - V5 * DDROPS(L,1) * ENROLL(2,I-1)
 3333 BPRESY(I,M) = BPROJ(M)
30003 CONTINUE
      IF (ISEX .EQ. 0) GO TO 98765
      DO 70077 M = 1,100
      GPROJ(M) = TPROJ(M) - BPROJ(M)
70077 CONTINUE
98765 ICOUNT = 1
      CALL OUTPUT (I, J, TPROJ, ICOUNT, IGRADE, IDATE)



Program MAIN (continued) 



      IF (ISEX .EQ. 0) GO TO 22222
      ICOUNT = 2
      CALL OUTPUT (I, J, BPROJ, ICOUNT, IGRADE, IDATE)
      ICOUNT = 3
      CALL OUTPUT (I, J, GPROJ, ICOUNT, IGRADE, IDATE)
      GO TO 22222
C     THIS PART OF PROGRAM PROJECTS ENROLLMENT FOR SECOND (OR FIRST)
C     GRADE THROUGH GRADE 12 (OR 13) FOR THE REMAINING YEARS OF THE
C     SIMULATION
88888 DO 40004 M = 1,100
      TPROJ(M) = TPREVY(I-1,M)
      CALL GAUSS (IX, SMIGRA(I,J), XMIGRA(I,J), V1)
      TPROJ(M) = TPROJ(M) + V1
      CALL GAUSS (IX, STRANS(I,J), XTRANS(I,J), V2)
      TPROJ(M) = TPROJ(M) + V2
      TPROJ(M) = TPROJ(M) - RHOLDS(M) * TPREVY(I-1,M)
      CALL GAUSS (IX, SHOLDS(I,J), XHOLDS(I,J), V3)
      RHOLDS(M) = V3
      TPROJ(M) = TPROJ(M) + V3 * TPREVY(I,M)
      CALL GAUSS (IX, SINSTI(I,J), XINSTI(I,J), V4)
      TPROJ(M) = TPROJ(M) - V4 * TPREVY(I-1,M)
      IF (IGRADE .EQ. 13 .AND. I .GT. 8) GO TO 34
      IF (IGRADE .EQ. 12 .AND. I .GT. 7) GO TO 34
      GO TO 205
   34 L = I - IGRADE + 5
      CALL GAUSS (IX, SDROPS(L,J), XDROPS(L,J), V5)
      TPROJ(M) = TPROJ(M) - V5 * TPREVY(I-1,M)
  205 TPRESY(I,M) = TPROJ(M)
      IF (ISEX .EQ. 0) GO TO 40004
      BPROJ(M) = BPREVY(I-1,M)
      BPROJ(M) = BPROJ(M) + V1 * DMIGRA(I,J)
      BPROJ(M) = BPROJ(M) + V2 * DTRANS(I,J)
      BPROJ(M) = BPROJ(M) - RHOLDS(M) * DHOLDS(I-1,J) *
     2BPREVY(I-1,M)
      BPROJ(M) = BPROJ(M) + V3 * DHOLDS(I,J) * BPREVY(I,M)
      BPROJ(M) = BPROJ(M) - V4 * DINSTI(I,J) * BPREVY(I-1,M)
      IF (IGRADE .EQ. 13 .AND. I .GT. 8) GO TO 44
      IF (IGRADE .EQ. 12 .AND. I .GT. 7) GO TO 44
      GO TO 4444
   44 BPROJ(M) = BPROJ(M) - V5 * DDROPS(L,J) * BPREVY(I-1,M)
 4444 BPRESY(I,M) = BPROJ(M)
40004 CONTINUE
      IF (ISEX .EQ. 0) GO TO 98989
      DO 77007 M = 1,100
      GPROJ(M) = TPROJ(M) - BPROJ(M)
77007 CONTINUE
98989 ICOUNT = 1
      CALL OUTPUT (I, J, TPROJ, ICOUNT, IGRADE, IDATE)
      IF (ISEX .EQ. 0) GO TO 22222
      ICOUNT = 2
      CALL OUTPUT (I, J, BPROJ, ICOUNT, IGRADE, IDATE)
      ICOUNT = 3
      CALL OUTPUT (I, J, GPROJ, ICOUNT, IGRADE, IDATE)
22222 CONTINUE
      DO 55555 M = 1,100
      DO 55555 I = 1,IGRADE
      TPREVY(I,M) = TPRESY(I,M)
      BPREVY(I,M) = BPRESY(I,M)
55555 CONTINUE
11111 CONTINUE
      STOP
      END
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APPENDIX B 

Subroutine OUTPUT 



      SUBROUTINE OUTPUT (I, J, PROJ, ICOUNT, IGRADE, IDATE)
      DIMENSION PROJ(100), ENPROJ(11), PROB(11)
C     ORDERING ROUTINE
    2 ISWCH = 0
      DO 1 M = 1,99
      IF (PROJ(M) .LE. PROJ(M + 1)) GO TO 1
      FINTER = PROJ(M)
      PROJ(M) = PROJ(M + 1)
      PROJ(M + 1) = FINTER
      ISWCH = 1
    1 CONTINUE
      IF (ISWCH .EQ. 1) GO TO 2
C     ROUTINE FOR COMPUTING PROBABILITY LEVELS
      ENPROJ(1) = (PROJ(5) + PROJ(6))/2.
      ENPROJ(2) = (PROJ(10) + PROJ(11))/2.
      ENPROJ(3) = (PROJ(20) + PROJ(21))/2.
      ENPROJ(4) = (PROJ(30) + PROJ(31))/2.
      ENPROJ(5) = (PROJ(40) + PROJ(41))/2.
      ENPROJ(6) = (PROJ(50) + PROJ(51))/2.
      ENPROJ(7) = (PROJ(60) + PROJ(61))/2.
      ENPROJ(8) = (PROJ(70) + PROJ(71))/2.
      ENPROJ(9) = (PROJ(80) + PROJ(81))/2.
      ENPROJ(10) = (PROJ(90) + PROJ(91))/2.
      ENPROJ(11) = (PROJ(95) + PROJ(96))/2.
      PROB(1) = .05
      PROB(2) = .10
      PROB(3) = .20
      PROB(4) = .30
      PROB(5) = .40
      PROB(6) = .50
      PROB(7) = .60
      PROB(8) = .70
      PROB(9) = .80
      PROB(10) = .90
      PROB(11) = .95
      IF (IGRADE .EQ. 12) II = I
      IF (IGRADE .EQ. 13) II = I - 1
      JJ = J + IDATE - 1
C     PRINTING ROUTINE
      IF (ICOUNT .EQ. 1) WRITE (6,7001) II,JJ,II,JJ
      IF (ICOUNT .EQ. 2) WRITE (6,7002) II,JJ,II,JJ
      IF (ICOUNT .EQ. 3) WRITE (6,7003) II,JJ,II,JJ
 7001 FORMAT (/////1H0,'PROBABILITY THAT TOTAL ENROLLMENT IN GRADE',1X,
     2I2,1X,'IN',1X,I4,14X,'PROBABILITY THAT TOTAL ENROLLMENT IN GRADE',
     31X,I2,1X,'IN',1X,I4/1H ,'WILL BE LESS THAN THE SPECIFIED PREDICTED
     4 ENROLLMENT',14X,'WILL BE GREATER THAN THE SPECIFIED PREDICTED ENR
     5OLLMENT'//1H ,4X,'PROBABILITY',5X,'PREDICTED ENROLLMENT',30X,'PROB
     6ABILITY',5X,'PREDICTED ENROLLMENT'//)
 7002 FORMAT (/////1H0,'PROBABILITY THAT MALE ENROLLMENT IN GRADE',1X,
     2I2,1X,'IN',1X,I4,15X,'PROBABILITY THAT MALE ENROLLMENT IN GRADE',
     31X,I2,1X,'IN',1X,I4/1H ,'WILL BE LESS THAN THE SPECIFIED PREDICTED
     4 ENROLLMENT',14X,'WILL BE GREATER THAN THE SPECIFIED PREDICTED ENR
     5OLLMENT'//1H ,4X,'PROBABILITY',5X,'PREDICTED ENROLLMENT',30X,'PROB
     6ABILITY',5X,'PREDICTED ENROLLMENT'//)
 7003 FORMAT (/////1H0,'PROBABILITY THAT FEMALE ENROLLMENT IN GRADE',
     21X,I2,1X,'IN',1X,I4,13X,'PROBABILITY THAT FEMALE ENROLLMENT IN GRA
     3DE',1X,I2,1X,'IN',1X,I4/1H ,'WILL BE LESS THAN THE SPECIFIED PREDI
     4CTED ENROLLMENT',14X,'WILL BE GREATER THAN THE SPECIFIED PREDICTED
     5 ENROLLMENT'//1H ,4X,'PROBABILITY',5X,'PREDICTED ENROLLMENT',30X,
     6'PROBABILITY',5X,'PREDICTED ENROLLMENT'//)
      WRITE (6,7000) PROB(1), ENPROJ(1), PROB(1),
     2ENPROJ(11), PROB(2), ENPROJ(2), PROB(2), ENPROJ(10), PROB(3),
     3ENPROJ(3), PROB(3), ENPROJ(9), PROB(4), ENPROJ(4), PROB(4),
     4ENPROJ(8), PROB(5), ENPROJ(5), PROB(5), ENPROJ(7), PROB(6),
     5ENPROJ(6), PROB(6), ENPROJ(6), PROB(7), ENPROJ(7), PROB(7),
     6ENPROJ(5), PROB(8), ENPROJ(8), PROB(8), ENPROJ(4), PROB(9),
     7ENPROJ(9), PROB(9), ENPROJ(3), PROB(10), ENPROJ(10), PROB(10),
     8ENPROJ(2), PROB(11), ENPROJ(11), PROB(11), ENPROJ(1)
 7000 FORMAT (1H ,8X,F3.2,11X,F10.0,42X,F3.2,11X,F10.0)
      WRITE (7,5000) (PROJ(LJK), LJK = 1,100)
 5000 FORMAT (16F5.0)
      RETURN
      END
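
The probability levels that OUTPUT reports can be illustrated with a short Python sketch. It is not part of the program. It assumes the 100 trial values arrive unordered, uses Python's built-in sort in place of the exchange sort in the listing, and uses hypothetical function names. Each level is the midpoint of two adjacent ordered values, and each printed row pairs a "less than" level with the "greater than" level taken from the opposite end of the table.

```python
# Ordered positions (1-based, as in the listing) whose value is averaged with
# the next one, and the probability attached to each resulting level.
POSITIONS = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
PROBABILITIES = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]


def probability_levels(trials):
    """Return the eleven enrollment levels computed from 100 trial values."""
    if len(trials) != 100:
        raise ValueError("exactly 100 trial values are expected")
    ordered = sorted(trials)
    # Position p is index p - 1; the next ordered value is index p.
    return [(ordered[p - 1] + ordered[p]) / 2.0 for p in POSITIONS]


def report_rows(trials):
    """Return rows of (probability, less-than level, greater-than level)."""
    levels = probability_levels(trials)
    last = len(levels) - 1
    return [
        (prob, levels[k], levels[last - k])
        for k, prob in enumerate(PROBABILITIES)
    ]
```

With trial values 1 through 100, the first row is (0.05, 5.5, 95.5): a probability of .05 that enrollment falls below 5.5 and a probability of .05 that it exceeds 95.5.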



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INSTRUCTIONS TO THE USER 
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The input deck should consist of the following cards: 

I Title card

Columns 1-80 may be used. The title will be used as a heading
on the output.

II Parameter card

VARIABLE                                      FORM                             COLUMN(S)

IYEAR   (number of years to be simulated)     any integer from one to fifteen  9-10
IGRADE  (number of grades to be simulated)    either 12 or 13                  14-15
ISEX    (to indicate use of sex option)       1 if yes, 0 if no                20
IDATE   (first year to be simulated)          19xx                             22-25
IX      (number for random number generator)  any 9 digit odd integer          27-35
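
As an illustration of the parameter card layout, the following Python sketch reads the five fields by column position and checks them against the stated forms. It is not part of the program; the function name and the error messages are hypothetical, and it assumes each field is punched as a plain integer within its columns.

```python
def parse_parameter_card(card):
    """Read IYEAR, IGRADE, ISEX, IDATE and IX from an 80-column card image."""
    card = card.ljust(80)
    # Card columns are 1-based and inclusive, so columns c..d are card[c-1:d].
    columns = {
        "IYEAR": (9, 10),
        "IGRADE": (14, 15),
        "ISEX": (20, 20),
        "IDATE": (22, 25),
        "IX": (27, 35),
    }
    values = {
        name: int(card[first - 1:last])
        for name, (first, last) in columns.items()
    }
    if not 1 <= values["IYEAR"] <= 15:
        raise ValueError("IYEAR must be an integer from one to fifteen")
    if values["IGRADE"] not in (12, 13):
        raise ValueError("IGRADE must be either 12 or 13")
    if values["ISEX"] not in (0, 1):
        raise ValueError("ISEX must be 1 (yes) or 0 (no)")
    if values["IX"] % 2 == 0 or not 100000000 <= values["IX"] <= 999999999:
        raise ValueError("IX must be a 9 digit odd integer")
    return values
```

A card with "10" in columns 9-10, "13" in columns 14-15, "1" in column 20, "1972" in columns 22-25, and "123456789" in columns 27-35 would describe a ten-year simulation of thirteen grades with the sex option, beginning in 1972.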



III Format cards for input parameters and variables. 

Use a separate card for each of the 13 formats. Columns 
1-80 may be used. The following formats are suggested, but 
not required. 

Variable (V) or Parameter (P)      Format When Sex       Format When Sex
                                   Option Is Not Used    Option Is Used

(P1)  ENROLL                       (6(F5.0,5x))          (12F5.0)
(P2)  BIRTHS                       (5(F5.0,5x))          (10F5.0)
(P3)  DEATHS                       (5(F5.0,5x))          (10F5.0)
(V1)  ABIRTH, BBIRTH, CBIRTH       (3F4.0)               (3F4.0,4x,F4.3)
(V2)  AGIRLS, BGIRLS, CGIRLS       (3F5.0)               (3F5.0)
(V3)  ADEATH, BDEATH, CDEATH       (3F5.3)               (3F5.3,1x,F4.3)
(V4)  APREMI, BPREMI, CPREMI       (3F5.0)               (3F5.0,1x,F4.3)
(V5)  AMIGRA, BMIGRA, CMIGRA       (3F3.0)               (3F3.0,7x,F4.3)
(V6)  APRIVS, BPRIVS, CPRIVS       (3F4.0)               (3F4.0,4x,F4.3)
(V7)  ATRANS, BTRANS, CTRANS       (3F5.0)               (3F5.0,1x,F4.3)
(V8)  AHOLDS, BHOLDS, CHOLDS       (3F3.2)               (3F3.2,7x,F4.3)
(V9)  AINSTI, BINSTI, CINSTI       (3F4.2)               (3F4.2,4x,F4.3)
(V10) ADROPS, BDROPS, CDROPS       (3F3.2)               (3F3.2,7x,F4.3)



IV Following the format cards should be input parameter and
variable data cards. These cards should follow the formats
on the format cards. The following is a brief explanation
of the parameters and variables. In the explanation, "base
year" refers to the year immediately preceding the first
year of prediction; often it is the year in which the
prediction study is conducted. "Years of simulation" do not
include the base year. The "previous year" refers to the
twelve months previous to the simulation or prediction date,
a date during the fall semester for which the predictions
are made. The "year and grade of simulation" are the year
and grade for which the predictions are being made.



PARAMETERS 



(1) ENROLL (I)

I = grade level

The number of children in each
grade of the public schools at
the beginning of school in the
base year.


(2) BIRTHS (J)

J = year of simulation

The number of allocated births
in each of the four or five
years previous to the base year,
depending on whether kindergarten
or first grade is the
first level to be predicted.
This assumes figures for the
base year are not available
and must be predicted.


(3) DEATHS (J)

J = year of simulation

The number of deaths for these
births.


VARIABLES 



Each of the following variables has three or four forms,
all of which must be included in the data. The various
forms are designated by the prefixes "A," "B," "C," and "D."
For example, ABIRTH refers to the "high" estimate; BBIRTH
to the "most likely;" and CBIRTH, the "low" estimate. The
optional form, DBIRTH, refers to the estimated proportion
male for the variable. Other variables follow the same
pattern.
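
As an illustration of how three estimates can drive a normal random draw, the following Python sketch treats the "most likely" estimate as the mean and the "high" and "low" estimates as the bounds of a symmetric 98 percent interval. The symmetry, the function names, and the use of Python's random.gauss in place of the program's GAUSS subroutine are assumptions made for illustration; the conversion the program itself uses is not shown in this part of the listing.

```python
import random
from statistics import NormalDist

# Standard normal deviate that leaves 1 percent in each tail, so that
# mean +/- Z_98 * sigma spans the central 98 percent (about 2.326).
Z_98 = NormalDist().inv_cdf(0.99)


def normal_parameters(high, most_likely, low):
    """Return (mean, standard deviation) implied by the three estimates."""
    if not low <= most_likely <= high:
        raise ValueError("estimates must satisfy low <= most likely <= high")
    sigma = (high - low) / (2.0 * Z_98)
    return float(most_likely), sigma


def draw_value(rng, high, most_likely, low):
    """Draw one simulated value for a variable from its three estimates."""
    mean, sigma = normal_parameters(high, most_likely, low)
    return rng.gauss(mean, sigma)


if __name__ == "__main__":
    rng = random.Random(123456789)  # an odd nine-digit seed, like IX
    print(draw_value(rng, high=120.0, most_likely=100.0, low=80.0))
```

Under these assumptions, estimates of 120, 100, and 80 imply a standard deviation of about 8.6. If the most likely value is not midway between the high and low estimates, this symmetric treatment is only approximate.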

Data should be ordered so that all forms of the variable
with the first subscript values are read; then the three or
four forms with the second subscript values are read, and
so on. When two subscripts are used, the first subscript
is the first to vary, forming the "inside" loop.
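
The ordering rule can be sketched in Python as a pair of nested loops. This is an illustration only: the function name is hypothetical, and it assumes one data card per subscript combination, which is what the suggested three- and four-field formats imply.

```python
def card_order(name, first_size, second_size=None, sex_option=False):
    """List, card by card, the forms of a variable in the order they are read.

    name is the variable name without its form prefix, for example "BIRTH".
    """
    forms = ["A", "B", "C"] + (["D"] if sex_option else [])
    cards = []
    if second_size is None:
        # One subscript: the forms for each subscript value in turn.
        for j in range(1, first_size + 1):
            cards.append([f"{form}{name}({j})" for form in forms])
        return cards
    for j in range(1, second_size + 1):      # second subscript: outside loop
        for i in range(1, first_size + 1):   # first subscript: inside loop
            cards.append([f"{form}{name}({i},{j})" for form in forms])
    return cards
```

For ABIRTH with six age groups, the first six cards would carry age groups 1 through 6 for the first year, followed by six cards for the second year, and so on.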



(1) ABIRTH (I, J)

I = age group
J = year of simulation
minus four or five


The number of live births for
each 1000 females in each of
six age groups (15-19, 20-24,
25-29, 30-34, 35-39, and 40-44).
Values for this variable
will be required for the base
year of simulation and all
other years except the last
four or five, depending on
whether kindergarten or first
grade is the first grade to be
predicted.


(2) AGIRLS (I, J)

I = age group
J = year of simulation
minus four or five


The number of females in each
of the six age groups for the
years specified for Variable 1.


(3) ADEATH (J)

J = year of simulation


The proportions of preschool
deaths for the births estimated
by Variables 1 and 2.


(4) APREMI (J)

J = year of simulation


The net preschool migration in
numbers of children migrating
between birth and age of entry
into kindergarten or first
grade each year of simulation.


(5) AMIGRA (I, J)

I = grade of simulation
J = year of simulation


For each year and grade of simulation,
except the first grade
level, net migration to the
public schools during the previous
year and grade. If there
is a net gain, the value will be
positive; if there is a net loss,
it will be negative.


(6) APRIVS (J)

J = year of simulation


The number of potential kindergarten
children or first
graders who will enroll in non-public
school instead of public
school during each year of simulation.


(7) ATRANS (I, J)

I = grade of simulation
J = year of simulation


For each year and grade of the
simulation, net transfers to
non-public schools during the
previous year and grade. A
loss in the public schools would
be reflected by a negative net
transfer and a gain reflected by
a positive figure.



(8) AHOLDS (I, J)

I = grade in which
students remain
J = year of simulation


For each year of simulation,
the proportions of students
in each grade who are retained
at the end of the previous
year and will remain in that
grade level for another year.


(9) AINSTI (I, J)

I = grade of simulation
J = year of simulation


For each year and grade of
simulation, the percentage of
students who are dropped from
the rolls because of death or
institutionalization during
the previous year and grade.


(10) ADROPS (I, J)

I = grade of dropout
J = year of simulation


For each year of simulation,
the proportion of students who
dropped out of school from
grades seven through eleven
during the previous year.
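
To show how several of these variables combine, the following Python sketch computes one simulated total for the entering grade (kindergarten or first grade) in a year after the first. It follows the arithmetic in the corresponding part of Program MAIN, but it is an illustration only: the function and argument names are hypothetical, each argument is assumed to be a (mean, standard deviation) pair, and Python's random.gauss stands in for the GAUSS subroutine.

```python
import random


def entering_grade_trial(rng, birth_rates, women, death, premi, privs,
                         holds, previous_entering):
    """Return one simulated entering-grade enrollment (total, both sexes).

    birth_rates, women: six (mean, sd) pairs, one per age group
    death, premi, privs, holds: one (mean, sd) pair each
    previous_entering: entering-grade enrollment in this trial one year earlier
    """
    births = 0.0
    for rate, females in zip(birth_rates, women):
        # Births per 1000 females times the number of females (Variables 1, 2).
        births += rng.gauss(*rate) * rng.gauss(*females) / 1000.0
    # Remove the proportion dying before school age (Variable 3).
    survivors = births - rng.gauss(*death) * births
    # Add net preschool migration (Variable 4) and subtract children
    # entering non-public schools (Variable 6).
    enrolled = survivors + rng.gauss(*premi) - rng.gauss(*privs)
    # Add pupils retained in the entering grade (Variable 8).
    return enrolled + rng.gauss(*holds) * previous_entering
```

Repeating this calculation 100 times with fresh random draws gives the set of trial values from which the probability levels are taken. Setting every standard deviation to zero reduces it to a single deterministic projection.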
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